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V*l  '4.1 


Systems  of  linear  algebraic  equations  moat  be  solved  at  each  integration  step  in  all  coin* 
tastily  need  methods  for  the  numerical  solution  of  systems  of  stiff  TVFbfbr  ODEs.  Fre¬ 
quently,  a  substantial  portion  of  the  total  computational-work  and  storage  required  to  solve 
stiff  IVFs  is  devoted  to  solving  these  linear  algebraic  systems,  particularly  if  the  systems  are 

large.  Over  the  past  decade,  several  efficient  iterative  methods  have  been  developed  to  solve 

n*.  x  v  ~ -  #  r 

large  sparse  (nonsymmetrie)  systems  of  linear  algebraic  equations.  We  study  the  use  of  a  dam 
of  those  iterative  methods  in  codes  for  stiff  IVFs.  Our  theoretical  estimates  and  preliminary 
numerical  results  show  that  the  use  of  iterative  linear-equation  solvers  in  stiff -ODE  codes 
improves  the  efficiency  •  in  terms  of  both  computational-work  and  storage  •  with  which  a 
tignifleant  dam  of  stiff  IVPs  having  large  sparse  Jacobians  can  be  solved. 
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As  Gear  [35, 36]  sad  any  others  have  noted,  a  major  open  problems  la  scientific  com¬ 
puting  bi  the  solution  of  latte  systems  of  stiff  initial- valoe  problems  (TVPs)  for  ordi¬ 

nary  differential  equations  (OOEs)  of  the  font 

/<»)-/ <**<*)).  y(fs)->s.  (U) 

These  problems  arise  either  directly  la  models  of  physical  systems  (such  as  chemical  kinctica  or 

electrical  networks)  or  indirectly  u  a  step  la  the  solution  of  another  problem  (sach  as  the 
application  of  the  method-of -lines  to  a  system  of  paraboiic  partial  differeatial  equations  [59]). 
Consequently,  the  efficient  solution  of  latte  systems  at  stiff  lVBi  is  at  peat  practical  impor¬ 
tance. 

Although  several  authors  have  investigated  techniques  for  avoiding  implicitness  in  the 
numerical  solution  at  stiff  IVPs,  most  workers  in  the  field  still  spec  with  Stettert  comment 
[79]  that  'til  reasonable  methods  for  stiff  systems  of  ODE*  hive  to  be  implicit*,  except,  possi¬ 
bly,  for  special  dames  of  problems.  That  is,  a  system  of  linear  or  nonlinear  algebraic  equa¬ 
tions  nwv  be  solved  at  each  step  of  the  numerical  integration.  Moreover,  it  seems  that  a 
Newton-like  method  moat  be  used  to  solve  the  nonlinear  systems  to  avoid  a  severe  restriction 
on  the  stepson.  Consequently,  large  systems  at  linear  equations  must  be  solved  in  this  case  as 
wen. 

As  we  explain  in  more  detail  in  #4,  frequently  a  substantial  portion  of  the  total 
computational-work  and  storage  required  to  solve  large  systems  of  stiff  1VT*  is  devoted  to 
solving  systems  of  linear  algebraic  equations.  Therefore,  any  improvement  in  the  efficiency 
with  which  *h—  linear  systems  an  solved  will  directly  improve  the  performance  of  the 
integrator.  Fortunately,  the  linear  algebraic  systems  that  arise  in  large  systems  of  mitt  IVPs 
are  usually  sparse  and  this  property  eaa  be  exploited  to  great  advantage. 

Over  the  part  decade,  several  efficient  iterative  methods  have  been  developed  to  solve 
large  spans  systems  of  linear  algebraic  equations.  The  Krylov  subspace  methods,  of  which  the 
conjugate  gradient  method  [43]  is  a  well-know  example,  have  proven  to  be  particularly 
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effecdvs  for  solving  the  linear  systems  that  arise  In  the  namerleal  solution  of  elliptic  and  para* 
hoik  partial  differential  equations.  (See,  for  example,  p,  10, 13, 14, 16, 17, 24, 42, 34, 55, 57, 
60, 61, 66,  69, 70,  SI,  S3, 86]  and  the  references  therein.)  Therefore,  it  ia  natural  to  con  aider 
the  ne  of  iterative  linear-equation  solvers  in  codes  for  large  systems  of  stiff  IVPs  for  ODEa. 
Not  only  are  iterative  methods  faster  than  direct  solvers  for  many  systems  of  linear  algebraic 
equations,  bat  also  they  require  significantly  leas  storage  than  direct  solvers  in  most  cases.  In 
addition,  the  use  of  iterative  methods  win  ease  some  of  the  restrictions  on  the  steprixe-  and 
ordcreeiection  strategies  used  in  stiff-OOE  codes;  we  believe  that  this  may  improve  the 
efficiency  of  these  codes  as  welL 

The  oatline  of  the  remainder  of  this  paper  is  u  follows.  Ia  12,  we  review  the  numerical 
eolation  of  the  formulas  used  in  many  of  the  most  popular  stiff -ODE  codes,  emphasiz¬ 

ing  the  relstionship  between  the  user  verified  error  tolerance  for  the  IVF  and  the  accuracy 
with  which  the  jwpfofr  formulas  must  be  solved.  We  also  show  that  s  large  class  of  stiff  IVPs 
have  properties  that  make  the  predated  systems  of  linear  algebraic  equations  amenable  to 
solution  by  iterative  methods.  We  then  review  the  'inexact  Newton  methods*  ia  which  the 
systems  of  linear  equations  that  arise  in  Newton's  method  are  solved  approximately,  rather 
than  exactly.  Again,  are  emphasize  the  relationship  between  the  accuracy  with  which  the 
implicit  formulas  and  associated  linear  algebraic  systems  most  be  solved. 

In  O,  we  review  iterative  linear-equation  solvere  with  particular  emphasis  on  two  Krylov 
snbspeee  methods:  the  preconditioned  conjugate  residual  method  for  symmetric  pasitive- 
dednite  systems  and  preconditioned  Orthomin(k)  for  aonaymoetrie  positive-real  systems.*  We 
also  point  oot  how  there  iterative  linear-equation  solvers  can  be  used  in  a  stiff-OOE  coda  that 
does  oot  ezpUdtly  compute  or  non  the  Jacobian  aswdited  with  the  FVP,  and,  ia  particular, 
how  the  linear  systems  can  be  preconditioned  in  this  case. 

Ia  |4,  we  develop  theoretical  estimates  of  the  computational  work  and  oarage  required 
t  A  iwd  mure  wswia  A  is  poSdvo-nof  ottfc  rsspmi  to  a  laew  prodms  if  sod  oojytf  (a .As)  >  0  for  al 
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to  solve  the  spstially-diicrerixcd  two*  and  three-dimensional  Heat  Equation  using  a  stiff -ODE 
solver  that  employs  either  direct  of  iterative  linear-equation  solvere.  In  15.  we  present  numer¬ 
ical  results  for  the  solution  of  the  spatially-discretized  two-  and  three-dimensional  Heat  and 
Convection-Diffusion  Equations  as  well  as  the  30  Stiff  Deteri  Problems  [27,  29]  uaag  stiff- 
ODE  sobers  based  upon  either  direct  or  iterative  linear-equation  sobers.  Both  the  theoretical 
and  numerical  results  look  quite  promising. 

Finally,  in  16,  we  present  our  oondnsions. 

This  paper  complements  the  work  of  Miranker  and  Chem  [64],  Gear  and  Saad  [37],  and 
Brown  and  Hiadmarsk  [3],  who  also  studied  the  use  of  iterative  linear-equation  solvers  in 
niff -ODE  codes.  We  believe  our  development  of  the  properties  at  the  linear  algebraic  systems 
that  arise  in  stiff -ODE  sobers  that  nuhea  these  linear  systems  amenable  to  solution  by  itera¬ 
tive  linear-equation  sobers  is  new,  ss  is  oar  analysis  of  the  relationship  between  the  three 
tolerances  required  in  a  stiff -ODE  code  employing  an  iterative  linear-equation  solver.  In 
addition,  our  theoretical  estimates  sad  numerical  results  extend  the  work  of  the  authors  refer¬ 
enced  above,  and,  in  particular,  show  the  importance  of  preconditioning  in  the  solution  of 
some  large  systems  of  stiff  IVFb. 

Although  their  poiat-of-view  is  distinctly  different,  the  predictor-corrector  methods 
developed  sad  analyzed  by  via  dcr  Houwea  and  Sommcijer  [51]  an  related  to  the  stiff -ODE 
methods  discussed  in  this  paper  and  those  referenced  in  the  preceding  paragraph. 

X  Impllett  Formulas. 

Many  numerical  methods  have  been  developed  during  the  past  few  decades  for  the  solu¬ 
tion  of  systems  of  stiff  IVFb  for  ODEs,  and  this  continues  to  be  sa  active  area  of  research. 
Most  of  these  methods  can  be  classified  ss  bring  in  one  of  three  families:  linear  maitistep 
(mnltiderivative)  methods,  implicit  Runge-Kutta  methods,  and  extrapolation  methods.  Of 
tbess,  the  linear  maitistep  methods  have  so  far  proven  to  be  the  most  soeeesrinl  [27, 29],  with 
the  moat  widely  used  codes  bring  DCFSUB  (33,  34],  GEAR  [44],  EPISODE  [5],  and  LSODB 
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[48],  each  at  which  is  bisod  upon  the  Backward  Differentiation  Formulas  (BDFi)  popularized 
by  Gear  [34].  Therefore,  ia  oar  dhcowioa  of  implicit  farmalas  below,  we  concentrate  on  the 
BDFs,  although  much  of  what  we  say  applies  to  itiff  methods  in  general. 

A  k-etep  BDF  far  the  nintioa  at  (11)  can  be  written  ia  the  fora 

Jm  •  ab.-i  ♦  •  ”  +  +  buMiUJm)-  (2DJ) 

Tables  of  coefficients  tat  these  farmalas  may  be  found  ia  [34].  To  advance  the  numerical 

mlution  from  /„_i  to  /,  -  ta-i  +  A.,  (101)  is  solved  for  the  approximation  y,  to  y(r„)  using 

the  previously  computed  approximations  Bccanse  (2DJ)  is  iaplkit  ia  y„,  aa  equation 

of  the  fora 

F(y.)  -  y.  -  *.*■/  (r,  jfa)  +  e.  -  0  (2HZ) 

most  be  solved  at  each  step  at  the  integration,  where  e„  cob  tains  the  terms  ia  (2DJ)  that  do 

not  depend  upon  ym. 

IL  Aecoraey  Keqolrsaeat  fer  (2X2). 

In  general,  (2&2)  is  nonlinear  and  cannot  be  solved  exactly.  Shampine  [73, 74]  dismisses 
accnracy  requirements  for  this  equation.  He  notes  that  most  sdff-ODE  coda  attempt  to  com* 
pete  aa  approximate  solution,  ,  to  (2J.1)  satisfying 

b.  -7.1  =*  'X-TOL,  (111) 

where  TOL  is  the  user  specified  error  tolerance  for  the  IVP  sad  ct  is  a  positive  constant  (usu¬ 
ally  less  than  1).  Shampine  [74]  presents  a  convincing  argument  that,  for  a  sdff-ODE  solver,  a 
more  appropriate  criterion  is  to  accept  f,  if 

|F(J,)|  *  ex  TOL.  (2X2) 

Not  only  is  this  criterion  more  easily  related  to  the  accnracy  requirement  for  the  IVP,  but  also 

it  is  dmpler  to  implement.  Furthermore,  Shampine  gives  aa  intuitive  argument  that  suggests 

that,  for  moat  stiff  problems, 

b.  -J.I  *  |F(7,)|.  OU) 

However,  Honbak,  Norsett,  and  Thomsen  [30]  demonstrate  that  it  often  takes  more  compote- 
donal  work  to  satisfy  (2X2)  than  (212)  with  little  or  no  gain  in  the  accuracy  of  the  numerical 


solution  of  the  associated  IV?.  Although  we  do  not  address  the  interesting  question  of  which 
at  them  topping  criterion  is  more  appropriate  in  a  stiff-ODE  solver,  we  do  develop  a  bound 
on  ba  -f,  |  in  terms  at  |F(?a)|  similar  to  one  given  by  Williams  [84],  bat  gang  a  Mmcwhat 
different  (and  possibly  simpler)  derivation.  This  bound  and  some  at  the  relations  used  in  its 
derivation  are  important  to  oar  dbcutdon  of  inexact  Newton  methods  and  iterative  linear- 
equation  solvers  below. 

The  validity  of  (2JJ)  is  intimately  related  to  the  stability  of  the  associated  IVP.  Assume 
that  the  IVP  satisfies 

y(y-tj~t)  (20.4) 

for  ail  (tjr)  and  (tjt)  in  the  domain  of  interest,  where  y  and  z  are  teal  vectors,  y  is  a  real  (pos¬ 
sibly  negative)  constant,  and  (*,-)  is  e  teal  inner-product.  This  assumption  is  frequently  made 
in  studying  the  nonlinear  stability  of  formulas  for  stiff  IVPs  as  it  ensures  the  stability  of  the 
IV?  (1.1)  in  the  following  tense.  Let  y(t)  be  a  solution  of  (LI)  and  let  z(t)  satisfy  the  same 
differential  equation  but  have  a  different  initial  value,  t(t*).  If  (2.1.4)  is  satisfied  in  a  domain 
containing  both  y(t)  and  z(t),  then 

b(f )-*(')!  s  e^^b(*s)-*(<4)l.  (2X5) 

when  |*|  is  the  norm  associated  with  the  inner-product  in  (2X4).  We  say  that  the  IV?  is  dis¬ 
sipative  if  and  only  if  y  <  0.  In  this  case,  the  IV?  is  asymptotically  stable  in  the  sense  that 
the  distance  between  y(t)  and  any  neighbouring  solution  of  the  differential  equation,  z(t), 
decreases  exponentially  with  t. 

Inequality  (2X4)  can  also  be  used  to  bound  bs  ~f»  I  terms  of  |F(?«)I-  Assume  that 
(2X4)  holds  at  t  •  »a  in  a  domain  containing  both  y,  and  j„.  By  (2.0.2)  and  (2X4), 

(i-AftiTHy-td ’-*)*  (2X6) 

Hence,  if  l-*,fi,y  >  0,  then,  by  (2 X6)  and  the  Cauchy-Scbwarts  inequality, 

b-*l*  :  TVT  Vb)-n*)l 


i 

if 


Taking y  >7.  and  *“Ja.  we  get 


(2X7) 


tea  which  it  follows  th at,  if  y  *  0,  then  (2X3)  holds  for  say  A,  >  0,  dace  A,  >  0  for  the 
BO  Ft.  Note  also  that,  if  1-hbPaT  >  0,  then  (2X7)  ensures  that  any  solution  of  /(y,)  *•  0  is 
aaiqae  in  the  domain  for  which  (2.1.4)  holds.  Moreover,  if  (2X4)  holds  at  t  “f,  for  all  real 
vector*  y  and  z  aad  if  l-A,B,y  >  0,  then,  by  the  Uniform  Monotoaicity  Theorem  [67],  the 


aaiqae  sofatiao  of  F(y„)  m  0  exists. 

Now  we  consider  in  more  detail  for  which  das*  of  stiff  IVPi  we  caa  expect  tbaconditioo 
1— X,g.y  >  0  to  hold  throa ghost  the  course  of  the  numerical  integration.  First,  note  that,  if 
the  Jacobian  /,(/  j)  exists  aad  is  continuous,  then  the  algebraically  smallest  y  for  which 
(2X4)  holds  is 


where  the  maximum  is  taken  over  all  nonzero  real  vectors  v  aad  all  (tjr)  in  the  domain  of 
interest.  Hence,  F,(y,)  •  I ~KP*f  ,(*•*»)  «*  positive-real  if  1  >  0. 

It  is  easy  to  dow  that 

no  (  Xr(X)  :X  am  dgenrolm  ef  f ,(/  j)  }  s  y, 

where  Jb(X)  is  the  real  pert  of  X.  If  is  symmetric,  then  equality  holds  in  the  last  ine¬ 

quality,  bat,  if  /,(tjp)  is  noasymmetric,  then  th*  inequality  may  be  strict,  as  the  example 
below  demonstrates.  However,  the  proof  of  Theorem  1  of  [39]  caa  be  adapted  easily  to  slum 
that,  for  any  Seed  (tjr)  aad  say  «  >  0,  there  exists  a  real  iaaer-product  aad  aa  associated  y 
satisfying  (2.L8)  for  which 


y  -a  *  max(Jtr(X):X  tmtigttwmlmt  of  f,(t  j))  s*  y. 


(2X9) 


For  IWft  haviag  a  symmetric  Jacobiaa,  it  is  quite  reasonable  to  expect  l-^ft.7  >  0.  In 
feet,  far  a  large  snbelam  of  these  problems,  all  the  eigenvalues  of  the  Jacobian  /,(/  jr)  are 
aonpoaitive,  tern  which  it  follows  that  7  s  0,  whence  I -A,  3,  7  a  1,  since  A^ft,  >  0.  Thus, 
(2X3)  hold*.  On  th*  other  hand,  if  y  >  0,  then  th*  Jacobian  most  have  a  positive  eigenvalue 


arbitrarily  doa*  to  y  in  the  ><«—■■<«  on  interest.  Consequently,  th*  differential  equation  has 
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solutions  whose  component*  any  crow  Uke  t*  in  t  neighbourhood  of  the  eolation  of  the  IVP. 
Hence,  it  is  reason  able  to  expect  s  stiff-ODE  solver  to  choose  s  stepsize  A,  for  which 
l-A.0,7  >  0  to  control  the  accuracy  in  the  potentially  growing  components  of  the  solution. 
In  fact,  if  y  >  0,  one  would  expect  1-A,0.y  to  he  close  to  1,  u  least  if  the  error  tolerance  is 
sufficiently  stringent.  Moreover,  the  numerical  solution  of  (23.2)  requires  that  Fr(ya)  be 
'numerically*  nonsingular.  This  effectively  ensures  that  1-A,0,7  >  0,  provided  that  6,0. 
changes  by  small  increments,  since  this  inequality  hoida  initially  for  ft,  sufficiently  malt  and, 
as  1— *,0,7  is  essentially  the  algebraically  smallest  eigenvalue  of  Ff(ya)  m  l  -A,0,/,(f,,y,), 
1— *,0,7  can  never  approach  or  pass  through  zero.  Thus,  for  IVP*  having  a  symmetric  Jaco¬ 
bian,  it  is  reasonable  to  expect  that  F,(ya)  will  be  positive-definite  and  (24.7)  will  hold  in  the 
event  that  (243)  does  not. 

If  the  Jacobian  /,(r  j»)  is  noasymmetric,  then  the  assumption  that  1-A,0,y  >  0  is  some¬ 
what  more  problematic,  since,  for  a  given  inner-product,  the  associated  7  given  by  (2.13)  may 
be  much  larger  chan  the  real  pert  of  any  of  the  eigenvalues  of  For  example,  consider 

the  differential  equation  y '  »  Ay ,  where 


The  eigenvalues  of  A  are  both  -1,  but,  for  the  usual  Euclidean  inner-product. 


7  ■  max  ■  -1  +  |a  | 

can  be  arbitrarily  large  even  though  the  eigenvalues  of  A  ire  fixed.  In  particular,  if  |a  |  >  1, 
then  A  is  not  negative-rest.  Moreover,  if  s  stiff-ODE  solver  is  used  to  integrate  y"  -  Ay  over 
a  long  time  interval  with  absolute  error  control,  the  numerical  solution  will  decay  exponen¬ 
tially  outside  of  an  initial  tnasient  region  and  A,  will  become  large.  Hence,  for  large  t,  it  is 
reasonable  to  expect  that  1-A,0,7  «  0  and  F,(ja)  ■/  - *,0,3  will  not  be  postive-real. 
Furthermore,  for  the  usual  Euclidean  norm,  the  smallest  constant  c  that  ensures  that 

|y-*|  m  e  |F(y)-F(»)| 

is  |(l  -*,0»«*)~ll,  which  may  be  larger  than  vfia/47.  Thus,  for  Urge  t,  the  residual  is  not  s 


good  estimate  of  the  error  for  this  problem  in  the  usual  Euclidean  norm.  This  is  not  to  any 
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that,  (or  the  uul  Euclidean  inner-produet  aad  all  IV P»  having  nonsymmctric  Jacobians,  the 
eoaditioa  1-*,  fi,  y  >  0  will  be  violated  aad  Ff(y,)  will  not  be  positive- real  or  that  the  resi¬ 
dual  win  be  a  poor  estimate  of  the  error  in  (2J )2),  bot  this  is  the  ease  for  some  problems. 

Oa  the  other  hand,  even  if  the  Jacobian  /7(ratya)  is  soasymaetric,  by  aa  argument  simi¬ 
lar  to  the  one  used  in  the  symmetric  ease,  it  follows  that  it  is  reasonable  to  expect  that  the 
wepaize,  la  a  stiff-ODE  solver  will  be  restricted  by  aeearaey  considerations  to  the  extent 
that  1-X.fi,  Re(X)  >  0  for  all  eigenvalues  X  of  /,(raiya).  In  fact,  for  many  stiff  IVPs, 
Re(X)  *  0  for  aO  the  eigenvalues  of  /,(raiya),  whence  l-A.fi,Re(X)  *  1  without  any  restric¬ 
tion  on  the  stcpsiac  h,.  In  any  event,  if  lHh.fiaRe(X)  >  0  for  all  eigenvalues  X  of  /,(»,  jra), 
then,  by  (1X9),  there  exists  a  real  inner-produet  with  respect  to  which  F,(ya)  is  positive-real, 
although  this  inner-product  may  depend  upon  (»,  jra).  For  example,  tar  the  matrix  A  above,  if 
we  use  the  real  inner-product  (xj)t  m  {xjy),  where  T  -  diog  (82,1)  aad  (v)  is  usual 
Euclidean  inner-product,  then 

y  m  max  ^  ■  —l  +  Uftl 

which,  for  8  «  e/a,  is  within  «  of  -1.  Hence,  for  0  <  8  <  ~r,  A  is  negative-real,  whence 

1*1 

/-1.M  is  positive-real  with  respect  to  the  (v)r  inner-product  for  any  A,  >  0.  Although 
these  observations  may  not  be  of  any  practical  importance  in  the  selection  of  an  error  control 
strategy  far  a  stiff-ODE  solver,  we  believe  that  they  may  be  of  significance  for  the  implemen¬ 
tation  of  iterative  linear-equation  solvers,  as  will  become  evident  in  13.  Moreover,  as  we 
in  that  section,  the  iterative  solvers  that  we  consider  are  guaranteed  to  converge  if 
Fy(jm)  is  positive-teal,  but  may  break-down  otherwise.  Hence,  their  break-down  gives  a  warn¬ 
ing  that  inequality  (11.7)  is  violated. 

IX  Numerical  Solution  af  (X0J). 

For  aonstiff-ODE  solvers,  it  is  common  to  use  functional  iteration  to  solve  (2D 2)  at  to 
employ  an  implicit  formula,  such  as  (2JU),  as  the  corrector  in  a  predictor-corrector  method. 
However,  for  stiff-ODE  solvers,  the  uae  of  either  of  these  techniques  severely  restricts  the 


tfepaze  and  it  is  exactly  this  type  of  restriction  that  most  be  avoided  for  stiff  problems. 
Therefore,  in  mom  stiff-ODE  solvers,  a  chord-Newton  method  is  used  to  solve  (2.0.2)  at  each 
ticp  in  the  integration.  That  is,  given  an  initial  approximation  y*  to  the  solution  y,  of  (2 .02), 
the  system  of  linear  equations 

-y S)  +  F  (y*)  -  0  (222) 

is  solved  repeatedly  until  an  acceptable  approximation  y*  is  computed,  where  IF*  is  an  approxi¬ 
mation  to  the  Newton  iteration  matrix 

r,(y!)  -  /  fc/,(*..y.*>.  (222) 

Frequently,  IF*  »  just  the  Newton  iteration  matrix  retained  from  an  earlier  iteration  on  the 
current  or  a  previous  step.  Of  course,  if  (1J.)  is  linear  and  the  exact  Newton  iteration  matrix 
(2 22)  is  used  at  each  step,  then  (221)  gives  the  solution  to  (2.0.2)  in  one  iteration. 

With  the  exception  of  GEARBI  [47],  all  the  ‘production*  codes  for  stiff  FVPs  known  to 
the  authors  employ  direct  methods  to  solve  the  system  of  linear  algebraic  equations  (22.1). 
For  example,  GEAR,  EPISODE,  and  LSODE  each  use  Gaussian  Elimination  (GE)  with  par¬ 
tial  pivoting,  while  DIFSUB  computes  the  inverse  of  IF*  explicitly.  For  large  systems  of  stiff 
IVPa,  great  savinp  in  both  time  and  storage  can  be  achieved  by  taking  advantage  of  the  spar¬ 
sity  at  the  Jacobian.  This  observation  lead  to  the  development  of  codrs  that  employ  either 
banded  GE  (such  as  GEARB  [45],  GEARIB  [46],  and  LSODE)  or  sparse  GE  (such  as  GEARS 
[77]  and  LSODE S  [49Q. 

Furthermore,  much  of  the  consideration  in  choosing  the  formulas,  strategies,  and  heuris¬ 
tics  in  a  stiff-ODE  solver  is  directed  towards  solving  (202)  as  efficiently  as  passible.  To  this 
end,  mast  stiff  methods  evaluate  the  Jacobian  and  refactor  TF*  as  seldom  as  possible,  since,  as 
explained  in  more  detail  in  1 4,  the  cost  of  these  two  operations  may  dominate  the  computa¬ 
tion.  Hence,  in  most  stiff-ODE  solvers,  W*  remains  unchanged  for  several  consecutive 
integration  steps. 

The  desire  to  avoid  refactoring  Wi  also  affects  the  choice  of  stepsire-  and  order-selection 
strategies  in  a  stiff-ODE  solver.  If  the  stepsixe  or  order  is  changed  from  one  step  to  the  next. 
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St  lea*  one  a f  the  terms  h„  or  ft,  occurring  in  the  Newton  iteration  matrix  (212)  is 
changed  as  wefl.  Therefore,  snleas  W£  is  updated  and  refactored,  it  may  be  a  poor  apprarima- 
thm  to  (231)  Aa  a  result,  the  chord-Newton  iteration  (121)  may  fail  to  converge  or  con* 
verge  too  dowly.  (Note  that  this  observation  applies  to  linear  as  well  as  nonlinear  IVPs.) 
Consequently,  the  iceprize-  and  ord  erne  lection  strategics  in  most  current  stiff -ODE  solvers 
ere  raxricted  by  this  consideration.  For  example,  EPISODE  changes  stepeize  and/or  order 
only  after  a  failed  step  or  when  it  estimates  that  it  can  increase  its  Sepsize  on  the  next  step  by 
a  factor  of  at  least  L3.  In  addition  to  farcing  the  method  to  take  mote  steps  and  function 
evalnattoas  to  integrate  a  problem  than  might  otherwise  be  required,  this  constraint  on  the 
order*  and  stepsize-selectian  strategies  reduces  the  'smoothness*  of  the  dependence  of  the 
actmal  error  committed  by  the  code  in  solving  a  problem  on  the  user  specified  error  tolerance; 
it  is  generally  agreed  [3d]  that  such  'smoothness*  is  a  very  desirable  property  for  aa  ODE 
solver  to  posaeas. 

The  choice  of  variaUe-stepsixe  implementation  of  a  multistep  formula  is  slso  affected  by 
the  consideration  of  how  this  choice  win  effect  the  efficiency  of  the  Newton  iteration.  The 
two  commonly  used  implementations  are  the  fixed-coefficient  implementation  (FCI)  of  Nord- 
aack  [65).  which  is  used  in  DIFSUB,  GEAR,  and  LSODE,  and  the  variable-coefficient  imple- 
omatatioa  (VCI),  which  is  used  in  EPISODE.  For  uoaatiff  problems,  both  theoretical  con* 
■derations  and  numerical  testing  have  shown  VCI  to  be  superior  to  FCI  for  the  Adams  for¬ 
mulas.  (See.  for  example,  [28, 38, 32, 73]  and  the  references  therein.) 

However,  this  deer  superiority  of  one  implementation  over  the  other  for  Adams  codes 
does  not  extend  to  stiff  methods  based  upon  the  BDFs.  The  reason  for  this  seems  to  be  that, 
when  VCI  is  used  with  a  k-step  BDF,  the  coefficient  pa  in  (222)  continues  to  change  on  each 
of  the  k-1  steps  following  a  stepaze  change.  Therefore,  unless  w£  is  updated  and  refactored 
on  each  of  these  steps,  it  may  be  a  poor  approximation  to  the  Newton  iteration  matrix  (222). 
On  the  ether  hand,  FCI  does  not  share  this  disadvantage,  since,  for  this  implementation,  ft,  is 
s  constant  that  depends  only  upon  the  formula  being  used.  We  believe  that  it  is  primarily  for 
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this  reason  that  the  numerical  (emits  is  [27,  29]  indicate  that  GEAR  is  more  efficient  than 
EPISODE.  On  the  other  hand,  the  numerical  results  in  [6,  7]  suggest  that  EPISODE  is  more 
robust  than  GEAR.  This  empirical  observation  is  supported  by  the  theoretical  results  in  [38], 
which  show  that  VG  is  more  stable  than  EG  for  the  BDFs.  !v  *  [S3]  for  a  more  detailed  dis- 
eunha  of  this  topic.) 

The  coat  o f  so  bring  the  implicit  equation  (2D.2)  also  affects  the  choice  of  formulas  used 
in  a  stiff -ODE  solver.  For  example,  although  A-stable  for  arbitrarily  high  orders,  the  classical 
Runge-Kutta  formulas  (IRKFs)  [4]  suffer  the  major  disadvantage  that  the  implicit  sys¬ 
tem  of  equations  associated  with  an  S-stage  formula  is  S  times  as  large  as  the  corresponding 
system  (2jD2)  for  the  BDFs. 

There  has  been  a  considerable  effort  during  the  past  decade  to  alleviate  some  of  the 
difficulties  discussed  above  associated  with  solving  an  implicit  equation  of  the  form  (2JJ.2)  at 
each  step  of  the  integration  of  a  stiff  ODE.  However,  one  approach  that  has  has  only  recently 
begun  to  be  investigated  actively  is  the  use  of  iterative  methods  to  solve  (2£.l)  [3, 37, 64]. 

Far  parabolic  PDEs.  iterative  methods  have  been  popular  since  the  early  days  of  comput¬ 
ing:  SOR  and  ADI  have  been  used  effectively  for  several  decades  [80].  More  recently,  the 
conjugate  gradient  method  [2, 16, 17, 55]  has  received  a  considerable  amount  of  attention.  We 
believe  that  the  use  of  iterative  methods  in  stiff -ODE  codes  should  be  investigated  as  well.  It 
appears  that  these  methods  offer  a  great  potential  for  reducing  the  cost  -  in  terms  of  both 
time  and  storage  •  of  solving  large  systems  of  stiff  IVPi  having  sparse  Jacobians.  Furthermore, 
aa  dlacnaaod  in  more  detail  below,  the  use  of  iterative  methods  may  alleviate  some  of  the  con¬ 
straints  on  the  stepsixe-  and  order-selection  strategies  discussed  above. 

13.  Inexact  Newton  Methods. 

To  begin,  note  that  the  linear  equation  (2JU)  is  solved  only  to  obtain  an  approximate 
solution  to  the  nonlinear  equation  (2J02):  there  is  no  reason  why  a  direct  linear-equation 
solver  must  be  used  in  •  stiff -ODE  code  to  solve  (&2J.).  Moreover,  Sherman  [76]  and  Dembo, 


Eiaessut,  tad  Steihang  (15]  show  that  it  is  only  necessity  to  approximate  the  sotntioa  of  thaae 
linear  equations  'sufficiently  accurately*  to  obtain  a  quadratic  rate  of  convergence  for  the 
Newton  iteration. 

More  specifically,  consider  the  data  of  inexact  Newton  methods  (15].  Given  an  initial 
(Mas  ji,  any  soefa  method  compotes  a  sequence  of  values  {yi}  satisfying  the  recursion 

|F,<yi)<y.*  **-**>  +  F(ya*)|  *  n»  |F(ya*)l.  (23.1) 

when  %  *  lUa  <  1.  In  the  next  section,  we  diseoat  the  use  of  iterative  methods  to  compote 

(y£*i~y!)  satisfying  (23.1),  but,  independently  of  how  yi*1  is  determined,  Dembo.  Eiscnatat, 

and  Steihang  (15]  prove  that,  if 

(1)  F(ya)-  0. 

(2)  F  is  continuously  differentiable  in  a  neighbourhood  of  ya , 

(3)  ff(y*)  is  aonsingular,  and 

(4)  |y  •  ~ym  |  is  snfBdentty  small. 

then  yt-  ym  with  a  rate  of  convergence  that  is  at  team  linear.  In  addition,  they  show  that 

(a)  y i-  ym  supcriincarly  if  n»-  0, 

(b)  ym  with  strong  order  at  lea*  1+p,  0<pssl,ifiu-O(|jF  (y*)F)  and  F,  is  Holder 
continuous  with  exponent  p  at  ya,2  and 

(e)  yt-  y„  with  weak  order  at  least  1+p,  0  <  p  *  1,  if  F,  is  Holder  continuous  with 
exponent  p  at  y„  and  0  with  weak  order  at  least  1+p. 

Taking  p  ■  1  in  (b),  we  get  that  an  Inexact  Newton  method  may  retain  the  quadratic  rate  of 
convergence  characteristic  of  Newton’s  method. 

Even  though  it  is  not  necessary  to  factor  or  invert  F,  (yf)  in  an  inexact  Newton  method, 
it  ia  necessary  to  evaluate  the  Jacobian  of  the  IV?,  /y(ra  jf),  to  compote  F,(yf)  on  each  iters* 
tkm.  For  large  problems,  the  evaluation  of  the  Jacobian  may  be  very  expensive,  and. 


consequently,  should  be  avoided  whenever  possible.  Therefore,  we  consider  the  class  of  inex¬ 
act  chord-Newton  methods  for  which  (232)  is  replaced  by 

I W*(y*+,-y*)  +F(y!) |  m  tu  |F(y.*)|,  (232) 

where,  as  in  the  previous  subsection,  is  an  approximation  to  But  in  this  case,  if  an 

iterative  method  is  used  to  solve  (23.2),  there  is  little  additional  cost  associated  with  using  the 

current  value  of  the  scalar  in  W*.  although  the  Jacobian  may  remain  unchanged  from 

one  inexact  chord-Newton  iteration  to  the  next.  In  any  cue,  the  proof  of  Theorem  23  in  [13] 

can  be  adapted  easily  to  show  that  yi-  y,  linearly  for  an  inexact  chord-Newton  method  if,  in 

addition  to  (l)-{4)  above,  we  assume  that  Wf  is  a  good  approximation  to  Ff(ym)  in  the  sense 

that 

|W.*-F,(y,)|  *  f  *nd  l(W*)-,-^0r.)“,|  *  y. 
where  y  is  the  constant  appearing  in  the  similar  inequalities  (23)  and  (24),  respectively,  of 

[15]- 

Like  a  chord-Newton  method,  the  rate  of  convergence  of  an  inexact  chord-Newton 
method  is  not  supcrlinear  in  general.  This  together  with  the  convergence  results  quoted  above 
suggest  that  an  appropriate  choice  for  tu  is  a  constant  n  <  1,  since  (in  theory  at  least)  there  is 
no  benefit  in  allowing  tu-  0,  as  there  is  for  an  inexact  Newton  method,  while  allowing  tu  -  0 
makes  the  acceptance  criterion  (232)  more  stringent  and,  consequently,  more  expensive  to 
satisfy  for  an  iterative  linear-equation  solver. 

In  choosing  a  value  for  q,  it  is  useful  to  note  that,  in  many  stiff  ODE  solvers  such  as 
GEAR,  EPISODE,  and  LSODE,  y*  is  normally  a  very  good  initial  approximation  to  ym  in  the 
sense  that  both  ly„*  ~ym  i  and  |F(ya*)|  are  doae  to  TOL,  the  user  specified  error  tolerance  for 
the  IVP,  since  y*  is  computed  by  an  explicit  formula  of  the  same  order  u  the  implicit  correc¬ 
tor.  As  a  result,  usually  only  one  or  two  iterations  of  (222)  are  required  to  compute  yi  satis¬ 
fying  either  (222)  or  (222).  To  avoid  an  excessive  number  of  evaluations  of  F(yf)  when 
using  an  inexact  chord-Newton  method  to  solve  (232),  we  also  require  that,  on  most  steps, 
only  one  or  two  iterations  of  (232)  bo  used  to  compute  an  acceptable  y*.  Furthermore,  note 
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that 

r(yi+*)  *  r,bi)<yi*l-yt)  +  r(ri)  *  witi+'-yi)  +  *(,*). 

Hence,  if  |F  (y,*)!  *  TOL  and  we  west  yi  to  satiffy  (112),  then  a  reaaonabie  value  for  n  is 
r  <1,  when  r  <  1  is  a  positive  cooat  ant  and  c ,  ia  the  constant  appearing  is  (112).  As  alterna¬ 
tive  is  to  replace  (132)  by 

|Wj(rf*‘-y*)  +  F(y!)\  *  rxfTOL .  (133) 

aisce  we  require  only  that  yi  satisfy  the  acceptance  criterios  (112)  asd  sot  that  yi  ultimately 

converges  toy,. 

Based  upon  the  relationship  be  twees  |yj-y,  |  asd  lF(y,f)|  developed  is  1 11,  it  also 
seems  appropriate  to  use  cither  (132)  with  n  ■  r  «i  or  (233)  as  the  acceptance  criterios  for 
as  inexact  chord-Ncwt  oa  method  when  the  accept  as  ce  criterios  for  the  implicit  equation 
(102)  is  (211)  rather  than  (112),  although  the  justification  is  more  tenuous  is  this  case. 
However,  our  numerical  rests  reported  in  13,  baaed  upon  a  modified  verrios  of  LSOOE  which 
employe  LSODE's  acceptance  criterios  of  the  form  (114)  far  (232)  asd  the  acceptance  cri¬ 
terion  (233)  far  the  inexact  chord-Ncwt  on  method,  show  that  this  heuristic  worts  quite  well 
is  practice. 

A  stopping  criterios  of  the  form  (133)  tar  the  inexact  Newton  method  is  also  used  by 
Brown  and  Hindmanh  (31  in  their  modified  version  of  LSOOE.  In  addition,  they  prove  a 
result  about  the  iterates  yi,  which,  although  apparently  not  tight,  suggests  that  the  stopping 
criterion  (133)  is  appropriate  tar  stiff -ODE  solvers. 

Finally,  we  note  that  the  accuracy  of  the  approximation  yi  to  y,  affects  not  only  the 
accuracy  and  stability  of  the  underlying  implicit  ODE  formula  [38]  but  also  other  formulas, 
strategies,  and  heuristics  used  in  the  ODE  solver.  For  example,  in  oar  preliminary  numerical 

teats  with  a  modified  verrios  of  LSODE,  we  found  that  yi  ■  yi  often  satisfied  (233),  parties- 

/ 

larly  on  the  initial  steps  of  the  integration.  However,  accepting  yi  ■  yi  has  a  deleterious 
effect  upon  the  code,  dace  the  error  estimate  in  LSODE  is  based  upon  the  difference  between 
yi  and  the  accepted  yi  and,  moreover,  the  stepsixe-  and  order-selection  strategies  are  baaed 
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upon  the  magnitude  of  the  error  estimate.  Hence,  the  error  may  be  grossly  underestimated 
and  too  large  a  stepcize  selected  for  the  neat  step.  We  were  able  to  avoid  this  difficulty  in 
part  by  taking  rather  than  0,  as  the  initial  guess  for  yi*l-yi  in  the  iterative  solution 

of  (233).  Moreover,  as  this  initial  guess  corresponds  to  the  usual  corrector  in  a  predictor- 
corrector  method,  it  produces  a  good  initial  approsimatioa  to  the  nonstiff  components  of 
(233).  The  choice  of  a  good  initial  guess  for  is  discussed  in  more  detail  for  linear 

systems  at  IVPi  in  [64],  where,  in  oar  notation,  they  consider  initial  guesses  far  at  the 

form  -(/+A  +  •  •  •  +AJ)F(yS)  where  A  - 1  -W*  and  J  *  0.  However,  the  effect  of  the  accu¬ 
racy  at  the  approximation  y*  to  y,  on  the  formulas,  strategies,  and  heuristics  of  an  ODE  solver 
dearly  requires  much  more  study,  not  only  for  methods  employing  inexact  Newton  methods, 
but  also  for  all  methods  based  upon  implicit  formulas. 

3.  Iterative  Linear-  Equations  Solvers. 

In  this  section,  we  discum  the  choice  of  iterative  methods  for  solving  the  systems  of 
linear  equations  thst  arise  in  inexact  chord-Newton  methods  (232).  Because  these  iterative 
methods  mum  function  as  a  component  of  a  general  purpose  stiff -ODE  solver,  it  is  essential 
that  they  perform  effectively  for  general  spam  systems  of  linear  equations  and  are  not  depen¬ 
dent  upon  any  special  matrix  properties  such  as  those,  for  example,  associated  with  the  five- 
point  operator  for  the  two-dimensional  Lapiacian.  This  consideration  immediately  eliminates 
PDE-related  methods  such  as  ADI  or  multi-grid.  Moreover,  even  for  the  application  of  the 
method -of-lines  to  parabolic  problems,  many  of  these  FDE-telated  methods  are  unsuitable 
because  they  require  specific  information  about  the  FDE  itself  (e-g.,  grid  structure  or  operator 
q>  lift  in  gs)  which  is  not  usually  available  to  a  general-purpose  stiff -ODE  solver. 

Although  the  riaiairal  iterative  methods,  such  as  Jacobi,  Oauss-Seidel,  and  SOR,  are  not 
restricted  to  PDE-related  problems,  they  may  not  converge  if  the  linear  system  is  not  sym¬ 
metric  positive-definite.  Moreover,  theee  methods  sre  often  stow  when  used  on  their  own  and 
are,  therefore,  frequently  coupled  with  an  acceleration  technique  to  improve  their  conver¬ 
gence  rate.  For  example,  the  Symmetric  SOR  (SSOR)  method  [85]  may  be  accelerated  by 
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eithcr  the  Chebyshev  semi-iteration  method  or  Richardson1*  second-order  method  [40].  One 
undesirable  feature  a f  thcee  eecelentioa  techniques  is  the  need  to  estimate  psnmeters  to 
meke  them  effective.  Typically,  these  parameters  depend  upon  the  eigenvalues  of  the 
coefficient  matrix,  which  ue  generally  not  known  to  the  user  a  priori.  However,  adaptive 
Chebyshev  methods,  which  aatomaticaay  estimate  these  parameters,  have  been  developed 
recently  [60, 61]  for  both  symmetric  and  nonsymmetric  problems.  These  methods  may  be  par¬ 
ticularly  effective  (or  time  dependent  problems,  since  the  coefficient  matrix  w, f  of  (2 22)  (and, 
hence,  the  associated  optimal  Chebyshev  parameters  also)  change  slowly  from  step  to  step 
throughout  the  numerical  integration  o f  (1.1).  Moreover,  the  Chebyshev  iteration  it 
guaranteed  to  converge  if  the  required  parameters  ate  chosen  'correctly*  and  if  the  real  part 
o f  each  of  the  eigenvalues  of  W*  is  positive.  As  we  argued  in  the  last  section,  if  this  last  con¬ 
dition  is  not  satisfied,  then  the  stepsize  Jk,  is  almost  surely  too  large  and  should  be  reduced 
until  this  condition  is  satisfied  to  ensure  a  reliable  numerical  integration.  However,  to  date 
we  have  not  investigated  in  depth  the  use  of  adaptive  Chebyshev  methods  in  stiff-ODE 
solvers. 

The  Conjugate  Gradient  (CG)  method  is  posibiy  the  most  well-known  example  of 
another  class  of  iterative  methods  that  has  received  considerable  attention  recently.  CG  was 
originally  proposed  by  Hestenes  and  Sticfel  [43]  u  a  direct  method,  but  it  was  re-introduced 
by  Reid  [68]  as  an  iterative  method  for  large  spam  systems  of  linear  equations.  It  has  proven 
to  be  very  effective  in  the  latter  role  for  a  wide  range  of  problems  arising  from,  for  example, 
geophysical  applications  [71],  elliptic  PDEs  [10, 11, 13, 14, 72],  end  time -dependent  PDEs  [2, 16, 
17,  S3].  We  believe  that  this  darn  of  iterative  methods  is  also  suitable  for  solving  the  linear 
syetsas  that  arise  in  stiff-ODE  reivers.  In  particular,  we  consider  the  Preconditioned  Conju¬ 
gate  Residual  method  [10,  24]  and  one  of  its  generalizations  for  nonsymmetric  problems. 
Preconditioned  Orthomin  [20,  24,  81],  In  the  remainder  of  this  section,  we  give  a  brief 
deacription  of  these  methods. 


34.  TVs  Free— dlttened  C—>g*ta  Reddaal  Method. 

Throughout  this  subsection,  let  A  be  i  symmetric  positive-definite  matrix.  To  colve  the 
aystam  at  linear  equations 

Ax  m  b,  (344) 

the  Conjegate  Residual  (CR)  method,  like  CC,  requires  only  that  the  user  supply  a  routine  to 

ooaeputs  the  matrix-vector  product  Av  for  any  gives  vector  v.  Thus,  CR  can  take  full  advan¬ 
tage  at  the  spanaty  at  A.  However,  the  effectiveness  of  CR  can  often  be  improved  dramati¬ 
cally  by  applying  CR  to  the  equivalent  preconditioned  system 

Ai  -b  (3.L2) 

* 

l— read  at  (3.14),  where  A  mS~lAS~t  is  a  symmetric  positive-definite  matrix  rince  A  is, 
m  -  S'x,  b  •  S~lb,  and  fl  -55'  is  'close'  to  A  (in  a  sense  to  be  made  more  precise  below), 
bat  sober  aerially  lam  'expensive'  than  A  to  invert.  We  refer  to  CR  applied  to  (3.13)  aa  the 
Ptecoedh  toned  Cos  jugate  Residual  (PCR)  method  and  Q  aa  the  preconditioner. 

One  at  the  several  equivalent  forms  of  PCR  is  given  in  Figure  3.14.  Although  any 
inner-product  can  be  need  with  PCR.  the  usual  Euclidean  inner-product  is  most  often  used  in 

If  fl  ■  / ,  then  PCR  reduces  to  CR.  Both  methods  require  the  same  amount  of  storage,  but, 
tar  Q  *  /  ,  PCR  requires  one  additional  wive  of  the  form  fie  *  v  per  iteration.  Note,  though,  that 
the  matrix  5  associated  with  (313)  is  not  required  explicitly.  Alw,  if  Q  #  /,  only  the  residual 
-  Q~\b  -Ax,)  ■  Q~lr,  it  available  in  this  implementation  of  PCR;  it  the  residual  r,  tar  (3.14)  is 
required  alw,  than  either  one  additional  matrix-vector  product  of  the  form  flu  must  be  computed 
per  iteration  or  one  additional  vector  must  be  stored. 

It  is  well- known  [1.  10]  that  PCR  is  aa  optimal  polynomial-based  method  in  the  sense 
that  the  lm  iterate  x,  computed  by  PCR  minrmfrtt 

ift  I  -  (r,  j,)*  -  -  |r,  |fl-t  (344) 

over  the  transiated  preconditioned  Krylov  wbepace 


*e  ♦  <Q’lr+  (flH4)fl-Vn  . . .  .(fl-Uy-fc-V^ 


(34.4) 


Chooses* 


Set  r« 

Solve  Qf,  »  r* 

FOR  iH>  STEP  1  UNTIL  convergence  DO 
Solve  Oft  -Aft. 

-ft  +«,#, 

'(♦i  ■  f,  “ft* 

*i  -  <r,+i#,+d /</,*,) 
h+t  ■  ♦i+*«pi 
END  FOR 


View*  3X1:  The  Preconditioned  Conjugate  Residual  (PCR)  Method. 

whet*,  r,  —  h— Aft  is  the  residual  for  (3X1)  awacisted  with  ft,  for  ft  -Sft, 
r,  •  t— As,  -  S  V,  is  the  corresponding  residual  for  (3X2),  x,  is  the  initial  guess  for  the  solu- 
tiou  of  (3X1),  and  r,  -  S-Az,  is  the  associated  residual. 

The  Preconditioned  Conjugate  Gradient  (PCS)  method  can  be  implemented  in  a  similar 
wap,  but  w*  believe  that,  for  our  application,  PCR  is  more  appropriate  than  PCG.  First,  note 
that  the  inexact  ehord-Newton  method  requires  that  the  residual  of  (2X1)  satisfy  (2J.2). 
Therefore,  (or  this  problem,  CR  is  the  optimal  uapreconditiooed  Krylov  subspacs  method  in 
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the  eense  that  it  mmimhcs  the  son  of  the  residual  over  the  Krylov  tubspece  (3.1.4)  with 
Q  ■/.  Oa  the  other  head,  CO  minim  ires  the  A-aora  at  the  error 

l*“*»l*  "  (?-**(*-*))*  -  (riA^r,)*  -  UilA-u  (3X5) 

rather  thaa  the  resides!  itself,  over  the  same  space  (3X4).  However,  this  advantage  is  par¬ 
tially  lost  for  the  preconditioned  methods,  since  PCR  «« inimizcs  § r,  |fl.,  while  PCG 
| r,  |ri  over  (3X4).  Second,  PCR  can  be  generalized  more  easily  than  PCG  for  aoasymtnetric 
problems,  partly  becaaae  (3.13)  defines  a  norm  for  any  aonsingnlar  matrix  A,  provided  Q  is 
positive-real,  while  (3XS)  does  not.  Moreover,  the  preconditioned  Krylov  sabepace  methods 
discussed  in  the  nest  subsection  which  extend  PCR  to  no  asymmetric  systems  are  capable  of 
minimizing  the  residual  associated  with  (333)  provided  the  preconditioning  is  applied  *on  the 
right*.  Therefore,  we  consider  PCR  only  throughout  the  remainder  at  this  subsection, 
although  similar  results  hold  for  PCG. 

Sines  Mt  is  a  member  of  the  affine  spaee  (33.4),  the  residual  r,  associated  with  (333) 

satisfies 

r,  -  (i  -A  P|.t(A))r«  »  4(A)Pa,  (33.6) 

where  P(.t  is  a  polynomial  at  degree  i-1  and  is  a  polynomial  of  degree  i  that  satisfies 

•  • 

*i(0)m  L  If  A  has  k  distinct  eigenvalues  (which  are  all  positive  since  a  is  symmetric  positive- 

o 

definite),  we  can  choose  a  polynomial  A,  at  degree  k  such  that  fi*(0)  - 1  and  A»(X)  «0  for  each 
eigenvalue  h  of  A.  Since  PCR  minimizes  | r,  |  over  (33.4)  and  this  choice  at  makes  | r,  | 
zero,  it  follows  that  PCR,  like  PCG,  converges  to  the  exact  solution  at  (333)  in  at  most  k 
steps.  This  is  a  slightly  sharper  version  of  the  well-known  result  that  PCG  solves  (333)  in  at 
most  M  steps  (assuming  exact  arithmetic  is  used  in  the  computation),  where  M  is  the  dimen- 

don  at  the  system  (333).  Note,  though,  that  A  and  A  may  not  have  the  same  number  of  die- 

* 

finer  eigenvalues.  Ia  particular,  A  may  have  only  a  few  distinct  eigenvalues,  while  A  may  have 
M.  Consequently,  in  preconditioning,  one  must  take  care  nor  to  destroy  an  advantageous 
eigenvalue  distribution. 

More  generally,  one  can  derive  from  (333)  and  (3.13)  the  bound  [10] 
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ft  I  *  | «“  I* (Ml  J  s;.I  (3J*7) 

where  II,  is  the  set  of  polynomials  o f  degree  i  or  less  that  satisfy  R(0)-1  and  {X,}  are  the 
eigenvalue*  at  A,  which  are  also  the  eigenvalues  of  Q~lA  since  these  two  matrices  are  similar. 
Using  the  1*  Chebyshev  polynomial  as  a  particular  choice  tot  R,  one  can  derive  the  following 
hound  [1, 10] 


ftl*  2[1,7yN^^1  I'd 


(3X8) 


where  8(A)  -  XaaB(A)/XaM(A)  is  the  spectral  condition-number  of  A.  Again,  since  A  and 
fl_,A  are  similar,  IT  (A)  m  X{Q ~lA ).  Also,  since  ft  1  -  ft  |fl-u  inequalities  (3X7)  and  (3X8) 
hold  with  ft  |  and  |r*|  replaced  by  ft  |g.t  and  |r#la-i,  respectively.  Similarly,  it  is  well- 
known  [1, 10]  that  both  (3X7)  and  (3.18)  hold  for  PCG  with  ft  |  and  ||r (|  replaced  by  Hr,  |4.| 


and  |r(|4-t,  respectively. 


If  A  is  well-conditioned  or  the  eigenvalues  of  A  are  clustered,  then  CR  reduces  the 

error  in  the  initial  approaimation  very  rapidly.  Therefore,  this  method  can  be  expected  to 

perform  very  effectively  on  the  linear  equations  that  arise  in  mildly  stiff  IV Pi  or  in  large  IVPi 

for  which  the  eigenvalues  of  the  associated  Jacobian  form  a  few  dusters.  In  particular,  CR  is 

* 

well-suited  for  problems  with  a  few  stiff  components  only.  (See  [30,  82]  and  the  references 
therein  for  a  more  detailed  discussion  of  this  latter  daas  of  problem*.)  On  the  other  hand,  if 
A  is  31-canditioned  with  its  eigenvalues  spread  throughout  a  very  large  interval,  then  these 
bounds  suggest  that  CR  may  require  a  great  many  more  iterations  than  PCR  to  generate  an 
acceptable  approximation  to  the  solution  of  (3.1.1).  Since  such  linear  algebraic  systems  arise 
during  the  numerical  solution  at  many  large  systems  at  stiff  IVPs  (in  particularly,  those  that 
arise  tram  the  spatial  discretisation  of  time-dependent  FDEs),  we  believe  that  it  is  necessary 
to  consider  effective  preconditioning*  for  use  with  iterative  linear-equation  solvers  in  codes 


for  stiff  ODE*.  The  importance  of  preconditioning  is  demonstrated  in  the  next  two  sections. 

Among  the  more  popular  preconditfoainp  for  symmetric  positive-definite  systems  are 
SSOR  [42,  83],  the  Incomplete  Chotcsky  (IQ  factorization  [63],  and  the  Modified  Incomplete 
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LU  (VQLU)  factorization  [41],  a  generalization  of  the  Dupont-Kcndall-Rachford  (DKR)  fac- 
torizatioo  [18],  Bach  of  theae  preconditioning  can  be  written  in  the  fora 

Q  »LV  -4  +£. 

when  L  is  a  lower  triangular  matrix  having  the  same  sparsity  structure  as  A,  and  E  is  an  error 
matrix.  These  precooditioninp  do  not  require  mote  storage  than  the  original  matrix  A  and,  if 
implemented  carefully  [19],  may  require  substantially  less.  Furthermore,  to  solve  a  system 
Qm  *v  or  to  compute  Qu  for  any  of  these  preconditionings  does  not  require  more  computa¬ 
tional  work  than  multiplying  a  vector  by  A  and,  when  embedded  in  PCR,  may  requin  sub¬ 
stantially  less  [19]. 

As  explained  in  121,  if  the  Jacobian  is  symmetric,  then  it  is  reasonable  to  expect 

that  the  chord-Ncwton  iteration  matrix  Wj  (L21)  win  be  symmetric  positive-definite.  If  this  is 
not  the  case,  then  the  stepsize  A.  is  almost  surely  too  large  for  the  IV?  and  should  be  reduced 
until  Wi  is  positive-definite  to  ensure  a  reliable  numerical  integration.  For  w}  symmetric 
positive-definite,  the  SSOR  preconditioning  is  well-defined  and  both  the  1C  and  MDLU  incom¬ 
plete  factorizations  can  usually  be  formed  [56, 62,  63]. 

PCX  based  upon  these  preconditioninp  has  proven  to  be  very  effective  for  solving  the 
linear  equations  associated  with  self-adjoint  elliptic  PDEs  [10].  In  the  next  two  sections,  we 
present  some  theoretical  and  empirical  resales  for  the  spatially-discretized  Heat  Equation 
which  show  that  PCX  is  very  effective  for  this  model  problem  also. 

Although  both  PCG  and  PCX  have  proven  to  be  a  very  effective  methods  for  solving 
symmetric  positive-definite  systems  of  linear  algebraic  equations,  only  recently  have  they  been 
extended  to  solve  more  general  systems  effectively.  As  explained  in  the  previous  subsection, 
the  solution  of  symmetric  indefinite  systems  is  not  of  great  importance  for  it  iff -ODE  solvers. 
Therefore,  we  only  consider  the  solution  of  nonsyametrie  systems  in  this  sabeeerion. 

An  obvious  way  to  extend  either  PCG  or  PCX  to  solve  s  nonsymmetrie  system  Ax  ■  A  is 


to  apply  cither  of  these  methods  to  the  symmetric  positive-definite  normal  equations 
A' Am  m  A'b  or  to  the  related  system  AA'y  ■  A'y .  In  either  case,  though,  for  the  badly 

conditioned  systems  that  arise  in  stiff-ODE  solvers,  this  approach  is  not  attractive  because  it 
frequently  leads  to  a  slow  rate  of  convergence. 

Recently,  several  effective  Krylov  snbspaee  methods  have  bee-.,  developed  which  extend 
ECO  and/or  PCR  to  nonsymmetric  systems.  For  example.  Con  cos  and  Golub  [12]  and 
Widload  [83]  developed  a  technique  known  as  the  Generalized  Conjugate  Gradient  (GCG) 
method  which  uses  the  symmetric  part  of  A,  S  -  +A')»  •»  a  preconditioning.  GCG  is  par¬ 

ticularly  effective  if  a  'fast'  solver  exists  for  S.  Although  this  may  be  the  case  for  many  para¬ 
bolic  problems,  this  method  is  not  well-suited  for  use  in  a  general-purpose  niff -ODE  solver, 
since  theta  is  ao  guarantee  that  systems  of  the  fora  Sx  -  k  can  be  solved  cheaply. 

We  chose  to  base  our  investigation  of  the  use  of  iterative  linear-equation  solvers  in  codes 
for  stiff  IWb  upon  the  Preconditioned  Orthomin(k)  (POR(k»  method  [20,  24,  81],  in  exten¬ 
sion  at  PCR  to  nonsymmetric  systems,  partly  because  Elman’s  codes  [21, 25]  were  available  to 
us  and  partly  because,  like  PCR,  POR(k)  minimizes  the  residual  associated  with  the  precondi¬ 
tioned  system  over  a  subspace  described  in  more  detail  below.  One  of  several  other  alterna¬ 
tive  Krylov  snbspaee  methods  is  discused  by  Gear  and  Said  [37]  and  Brown  and  Hindmarsh 

n. 

In  this  subsection,  we  briefly  outline  POR(k)  and  the  related  Preconditioned  Generalized 
Conjugate  Residual  (PGCR)  method  from  which  it  is  derived;  s  more  detailed  discussion  of 
these  methods  can  be  found  in  (20, 24]. 

Like  PCR,  the  effectiveness  at  POR(k)  can  often  be  improved  dramatically  by  an 
appropriate  choice  of  preconditioning.  However,  rince  POR(k)  is  applicable  to  nonsymmetric 
systems,  there  is  more  flexibility  in  the  choice  of  preconditioning  for  POR(k)  than  there  is  for 
PCR.  More  specifically ,  the  preconditioned  system  associated  with  (3-1.1)  may  be  of  the  form 

Am  -i  (3il) 

*  m 

when  A  -  QilAQ{1,  x  m  Q^x,  k  -  Q  f'i ,  Qi  and  Qj  are  substantially  lea  'expensive*  to 
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invert  than  A,  and  the  preconditioning  matrix  Q  m  QiQi  is  'dose*  to  A  (in  a  sense  to  be  made 
more  precise  below).  In  this  formulation,  the  preconditioning  (3.1.2)  used  with  PCG  or  PCR  is 
equivalent  to  a  symmetric  positive-definite  split  preconditioning  having  Qi  **  Q\ .  Two  other 
pT^gntf  forms  of  preconditioning  (3.2.1)  are  worth  noting:  preconditioning  on  the  left  only 
with  Qim  l  and  preconditioning  on  the  right  only  with  (2t  *  /. 

The  prototype  of  the  PGCR  family  of  methods  from  which  PORQt)  is  derived  is  shown 
in  Figure  331  The  expression  for  at  used  there  is  mathematically  equivalent  to  the  erptca- 
■on  given  in  Figure  31.1,  bat  Elman  [24]  believes  that  the  former  is  less  sensitive  to  roundoff 
error  for  naasymmctric  problems. 

Choose  Xf 

Set  r*  ■  h-Axt. 

Compete  r»  ■ 

Compete  p%  “ 

FOR  iHI  STEP  1  UNTIL  convergence  DO 

a,  -  ft  32  flAp, )!{QixApt-Q  fU* ) 

*<-H  m*t  +a*Pi 
ft* |  -  r,  -OtQC'Ap, 

Compote  p,M t. 

END  FOR 

Vlgue  344:  The  Prototype  of  the  Preconditioned  Generalized  Conjugate  Residual 

(PGCR)  Family  of  Methods. 

The  two-term  recurrence 

Pt+i  m  ?t+\ + (342) 
used  in  PCR  generates  an  A'  A  -orthogonal  sequence  at  march  directions  {pt}  provided  A  is  sym¬ 
metric  positive 'definite.  However,  to  obtain  such  a  sequence  for  A  nonsymmetric,  it  appears  to  be 
nummary  to  explicitly  orthogonalize  pl4.t  against  all  previous  search  directions  pj  in  generaL  The 
recurrence  recommended  by  Elman  [20, 24]  for  PGCR  is 
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.  J,.,  _  (Qi1MIalrt+i&{*Ap,) 

fl!  r,*‘  5.v"  *'  *  <ac%,  ac‘»,) 


(3 22) 


PGCR 


of  the  prototype  method  given  in  Figure  3.2.1  together  with  these  lest  two  equations 


to  compote  a  >t. 

The  recurrence  (33J)  requires  the  storage  of  all  past  search  directions  Pj  as  well  ss  far 
more  computation  than  (333).  This  any  be  prohibitively  expensive  for  large  problems.  In 
POR(k),  the  truncated  recurrence 


Pt*\  -  +  2  b\Pl  •  it  »  max(0^  -i  +1),  (33.4) 

is  used  instead,  where  by  is  computed  u  in  (333).  That  is,  p,+x  is  orthogonalized  against  the 
past  k  search  directions  only.  Hence,  POR(k)  requires  the  storage  of  at  most  k  past  search 
directions  and  the  recurrence  (33.4)  is  ngnillcantly  cheaper  to  compute  than  (3 33). 

The  work  per  iteration  for  these  preconditioned  Kxytov  subspace  methods  is  the  same  as 
for  the  unpreconditioned  versions  except  that  Q^AQt1?,^  must  be  computed  in  place  of 
Art+\.  In  computing  the  former  product,  the  intermediate  result  Qf'n+i  can  be  used  to  com¬ 
pute  A+i,and  QilApt+i  can  be  computed  without  say  additional  matrix-vector  multiplies  pro¬ 
vided  that  QflApj  is  saved  instead  of  Pj .  Moreover,  for  SSOR  and  several  of  the  incomplete 
factorizations  (41,  43],  QC1Mlaln*i  can  be  computed  very  efficiently  using  Eiscnstat’s  tech¬ 
nique  [19]. 


If  Ax  ■  b  is  preconditioning  an  the  left  only  (flj  ■  / ),  then  each  of  the  PGCR  family  of 
methods  requires  the  same  amount  of  storage  u  its  corresponding  un preconditioned  version. 
Otherwise,  each  preconditioned  method  requires  one  more  vector  of  storage  than  its 
corresponding  unpreconditioned  version.  However,  the  residual  r,  calculated  in  this  imple¬ 
mentation  at  the  PGCR  family  of  methods  is  the  residual  associated  with  the  preconditioned 
system  (333).  If  the  residual  b  -Ax,  -  r,  -  Qxr,  associated  with  (333)  is  required,  then  the 
storage  advantage  of  preconditioning  on  the  left  only  is  lost:  in  this  case,  each  preconditioned 
method  requires  one  more  vector  of  norage  than  its  corresponding  unpreconditioned  version. 


t 
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K  A  is  podtive  ml,  then  both  PGCR  and  POR(k),  k  *0,  are  convergent  descent 
methods  in  the  tense  that  lr<l-»0  as  /-•  and  |r(«t|  <  |r,J  for  r,  +  0  [20,  24].  More 
specifically,  PGCR,  like  PCR,  minimizes  Jr,  |  over  the  translated  Krylov  sabspecc  (3.L4). 
Hence,  in  this  case  also,  the  rcaidnal  r,  at  the  1**  PGCR  iteration  satisfies 


|r,|snta|R(4)||r||, 


<?2S) 


where  II<  is  the  set  of  polynomials  of  degree  1  or  leas  satisfying  8(0)  -  L  Using  (3 2S),  one 
can  prove  [20, 24]  that 


fcl*  i- 


<n 


Ir'al. 


(3-2-6) 


where  5  ”  16(4  +4').  *b«  symmetric  part  of  4,  is  positive-definite  since  4  is  positive-real  by 
.  3 


This  bound,  though,  is  not  nearly  as  strong  as  (3.1.8)  even  though  PGCR  and 
PCR  compute  identical  iterates  x,  if  A  is  symmetric  positive-definite.  If  A  has  a  complete  set 
at  eigenvectors,  then 

i;,i**(r)M(!;(i.  (32.7) 

•  •  •  .  •  A 

where  X (T )  -  JT  |  |T“lJ  is  the  condition  number  of  the  matrix  T  that  diagonalizes  A, 

*<  -  1*(M-  (3A8) 

■  tuj 

a  a  a  a 

and  (Xj)  ate  the  eigenvalues  of  A .  Note,  if  A  is  normal,  then  JC  (T)  -  1. 

For  1  >6.  the  I*  iterate  x,  computed  by  POR(k)  minimizes  |r(|  over  the  affine 


^-*♦1  *  <Pt-*+u  * "  *  fi-i* 

rather  than  the  foil  translated  Krylov  subapace  (3.L4).  However,  in  this  case  also,  (3i6)  holds 
for  any  k. 

In  aQ  of  the  bounds  Uated  above,  r,  may  be  replaced  by  Q  f lr, ,  since  these  two  vectors 
are  eqnaL  Thus,  one  advantage  of  preconditioning  on  the  right  only  (fli  ■  /)  is  that,  in  this 
case,  the  PGCR  family  of  methods  the  residual  r,  associated  with  the  on  precondi¬ 

tioned  problem  Ax  -  b  at  each  iteration,  since  r,  -  rt.  As  explained  in  the  previous  subsec- 

S  0  <  (a  A*)  *  (rix)  for  a& ml  ooe-MtD  tveton  t,  doe*  A  m  3  +  H  it  poUfvo  rml  mi  (xjfx)  “  0  for 
si  roW  «  brin  N  •  %A  -A')  U  rtrw  wwawrris 
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tioc,  this  teems  to  be  the  most  appropriate  measure  of  the  error  to  be  minimised  by  an  itera¬ 
tive  method  embedded  in  an  inexact  Newton  iteration. 

Provided  that  the  associated  chord-Newton  iteration  matrix  W*  is  positive- real,  these 
bounds  indicate  that,  like  CR,  the  (onpreconditioned)  OCR  family  of  methods  is  very 
effective  for  mildly  stiff  IVP*  and  for  stiff  IVft  for  which  the  eigenvalues  of  the  associated 
Jacobian  form  a  few  dusters.  On  the  other  hand,  if  the  eigenvalues  of  Wj  are  spread 
throughout  a  large  domain,  then,  as  is  demonstrated  in  the  next  two  sections,  the  effectiveness 
of  POR(k)  may  be  improved  dramatically  by  an  appropriate  choice  of  preconditioning.  How¬ 
ever,  care  must  be  taken  in  cheering  a  preconditioning  since,  for  the  more  general  precondi- 

* 

tionings  considered  in  this  subsection,  A  may  fail  to  be  positive. real  even  though  A  is.  One 
advantage  of  using  the  symmetric  positive-definite  split  preconditioning  »  Q[)  is  that  A  is 
positive-real  if  and  only  if  A  is.  Furthermore,  if  A  is  'nearly*  symmetric,  then  so  is  QilAQ{* . 
and,  for  symmetric  problems,  the  iterates  x,  computed  by  POR(k),  *  i  1,  are  identical  to  the 
iterates  computed  by  PCR  and  PGCR.  Intuitively,  if  Q  tlAQ  {*  is  'nearly*  symmetric,  then  we 
expect  the  convergence  rate  of  POR(k)  to  be  dose  to  that  of  PGCR.  On  the  other  hand,  even 
if  A  and  Q  m  Q1Q1  are  both  'nearly*  symmetric,  (Jf'Aflf1  need  not  be,  and  the  convergence 
rate  at  POR(k)  may  be  significantly  slower  than  that  of  PGCR. 

Some  popular  precoaditionings  for  nonsymmetric  systems  ate  SSOR  [42, 85],  the  Incom¬ 
plete  LU  (ILU)  factorization  [63],  and  the  Modified  Incomplete  LU  (MDLU)  factorization  [41]. 
Each  of  these  preconditioning  can  be  written  in  the  form 

Q  m  LU  •A+B. 

where  L  and  U,  respectively,  are  tower  and  upper  triangular  matrices  having  the  same  sparsity 
pattern  as  A.  With  these  factorizations,  it  is  possible  to  precondition  on  the  left  or  right  or  to 
use  a  spilt  preconditioning  with  Qi  mL  and  QtmU. 

POR(k)  with  these  precon ditioninp  has  proven  to  be  very  effective  for  solving  the  sys¬ 
tems  of  linear  algebraic  equations  associated  with  discretized  non-self-td joint  elliptic  PDEs. 
Obviously,  the  smaller  k  is  the  mote  efficient  these  methods  are  in  terms  of  storage  required. 


-27- 


Elman  [24]  abo  found  that  these  methoda  are  most  efficient  in  terms  of  computational  work 
for  *  *  5,  with  *  *1  often  requiring  the  least  amount  of  work. 

As  mentioned  in  12.1,  Wi  is  positive- real  with  respect  to  a  given  inner-product  for  any 
k,  >  0  for  a  large  class  of  stiff  IVPs,  including  all  problems  that  are  dissipative  with  respect  to 
that  inner-product.  But,  for  any  given  inner-product,  there  are  stiff  IVPs  for  which  Wf  is  not 
positive-real  with  respect  to  that  inner-product  for  a  reasonable  choice  of  stepsize,  A,.  In  the 
latter  case,  any  member  of  the  PGCR  family  of  methods  based  upon  that  inner-product  may 
cither  compute  an  acceptable  numerical  solution  or  may  'break-dawn'  during  the  computa¬ 
tion. 

On  the  other  hand,  if  all  the  eigenvalues  of  the  Jacobian  lie  either  in  the  left- 

half  complex- plane  or  on  the  imaginary  axis,  then,  without  any  restriction  on  the  stepuze  A», 
all  the  eigenvalues  of  W*  lie  strictly  in  the  right-half  complex  plane.  Moreover,  as  discussed  in 
124,  even  if  some  of  the  eigenvalues  of  lie  in  the  right-half  complex  plane,  it  is  rea- 

ratable  to  expect  the  stepsize  A.  to  be  constrained  by  the  accuracy  requirements  to  the  extent 
that  all  the  eigenvalues  of  W*  will  lie  strictly  in  the  right-half  complex  plane.  In  either  case,  it 
follows  from  (2.L9)  that  there  cxisa  a  real  inner-product  with  respect  to  which  W*  is  positive- 
real.  We  hope  to  And  a  computationally  effective  way  to  utilize  this  result  to  dynamically 
choose  an  appropriate  inner-product  whenever  is  not  positive-real  with  respect  to  the  usual 
Euclidean  inner-product. 

M.  Jacsbtao-Free  Stlff-ODE  Solvers. 

As  several  authors  have  noted,  it  is  possible  to  avoid  explicitly  computing  and  storing  the 
Newton  iteration  matrix  when  solving  nonlinear  equations  by  an  inexact  Newton  method 
coupled  with  a  Kryiov  sobspace  method.  To  implement  such  a  Newton-Szylov  method,  it  is 
only  necessary  to  be  able  to  compute  Jv  for  any  given  vector  v,  where  J  is  an  approximation  to 
the  Jacobian  /,(*«  j>i).  In  many  stiff -ODE  solvers,  divided  differences  are  used  to  form  J. 
But,  since  J  is  not  needed  explicitly,  the  directional  difference 
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[/  8v)  ~f(tnjrf)]/i 

can  be  used  to  calculate  an  approximation  to  /,(*■*£) v  directly,  where  8  ia  a  scalar  constant. 

Oarg  and  Tapia  [32]  and  O’Leary  [66]  recently  investigated  a  similar  idea  (or  the  applica¬ 
tion  of  CO  to  problems.  O'Leary  shows  that,  ia  addition  to  saving  storage,  the 

Newtoo-CG  employing  directional  differences  requires  leas  computational  work  than 

the  traditional  diaerece-Newtoa  method  for  large  problems. 

Furthermore,  the  teat  results  of  Brown  and  Hindmanh  [3]  baaed  on  the  code  developed 
by  Gear  and  Saad  [37]  demonstrate  that  the  use  of  directional  derivatives  to  approximate 
matrix-vector  products  in  a  Newton-Krylov  iteration  is  very  effective  for  the  spatially- 
discretized  nonlinear  parabolic  problems  that  they  considered. 

However,  all  of  the  prceonditioaiap  referenced  above  require  an  explicit  representation 
of  the  matrix  J.  and  Jackson  [8]  though,  recently  developed  a  class  of  nonlinear  precon¬ 
ditioning,  including  a  variant  of  SSOR,  that  does  not  require  J  explicitly  and  so  can  be  used 
with  Newton-Krylov  methods  employing  directional  differences.  Moreover,  for  their  teat 
problems,  the  nonlinear  SSOR  preconditioning  was  as  effective  as  the  standard  explicit  SSOR 
preconditioning. 

computing  and  storing  Jaeobians  is  a  major  source  of  expense  ia  solving  large  stiff 
the  possibility  of  avoiding  this  computation  seems  very  attractive,  particularly  for  thoae 
ns  for  which  we  can  expect  a  Krylov  subspace  method  to  converge  very  rapidly,  such  as 
IVfc  for  which  the  eigenvalues  of  the  associated  Jacobian  form  a  few  dusters. 


4.  Theoretical  Resalts  for  the  Heal  Equation. 

The  theoretical  results  la  the  last  section  can  be  adapted  easily  to  show  that  the  use  of 
Krylov  subspace  methods  ia  stiff -ODE  solvers  is  very  effective  for  a  large  class  of  IVPa.  Ax  a 
particular  example,  ia  this  section,  we  compare  the  computational-work  and  storage  required 
to  solve  the  spatially-discretized  Heat  Equation  by  five  stiff-ODE  solvers  eaeh  bssed  upon  the 
BDFs  but  using  one  of  the  following  methods  to  solve  the  systems  of  linear  algebraic  equa¬ 
tions  that  arise  at  each  step  ia  the  numerical  integration:  (1)  full  Gaussian  Elimination  (GE), 


(2)  bud  GE.  (3)  (pane  GE,  (4)  the  Conjugate  Residual  (CR)  method,  or  (3)  the  Precondi¬ 
tioned  Conjugate  Residual  (PCR)  method  with  either  the  SSOR  [42, 85]  or  MILU  [41]  precon¬ 
ditioning.  Although  we  do  not  advocate  using  these  methods  to  solve  the  Heat  Equation  in 
practice,  the  spatially-discretized  Heat  Equation  is  a  good  test  problem  from  a  theoretical 
point-of-view  because  it  is  representative  of  a  class  of  large  stiff  IVPs  with  sparse  Jacobiaas 
and  it  css  be  analyzed  thoroughly. 

Consider  the  Heat  Equation  in  one  dimension  (1-D)  with  homogeneous  Dirichlct  boun¬ 
dary  conditions: 

%('.»)"*■(*.*)  /<w(r,s)€(t*»/]x(0,l),  (4.1) 

m(i  ft)  ••  m(t  J)  •  0  /or *€(**//]. 

■<*•*)  -  aeCO  for  *  f  flUl- 

Applying  the  method  of  lines  with  the  usual  centered -difference  approximation  with  stepsize 
A  “  — ^  to  the  spatial  derivative  of  (4.1)  gives  the  linear  system  of  U  -  si  ODEs  y'  “  Ay 

for  r€(r»t/]  with  initial  conditions  y((/|)  ■  a(r«JA)  for  i where  y,(t)  is  u  approxi¬ 
mation  to  «(i }  A)  and  At  -  Addict  (1,-24)-  It  is  well-known  that  the  eigenvalues  at  Aim 

{  2A-*[coa<I  Aw)  - 1] :  I  -l^s.  }.  (42) 

All  the  eigenvalues  are  negative  and  are  distributed  throughout  an  interval  from  approxi¬ 
mately  — w*  to  approximately  -4A"*.  As  the  spatial  discretization  becomes  finer,  the  resulting 
system  of  OOEs  becomes  both  larger  ud  stiff er,  but  the  eigenvalues  of  the  associated  matrix 
Aj  do  not  cluster. 

Also  consider  a  similar  spatial  discretization  of  the  Heat  Equation  in  two  dimensions 
(2-D)  ud  thru  dimensions  (3-D),  each  with  homogeneous  Dirichlct  boundary  conditions.  For 
the  2-D  problem,  the  matrix  At  associated  with  the  recalling  linear  system  of  M  -  m*  ODEs 
y* m  Ajj  is  A,  -  A~*dteg (f where  ft  is  the  exo  identity  matrix  and 
T i  "  diet  (1,-44)-  Hence,  the  eigen valuse  of  At  are 

{  X,  +X,  ), 

where  X,  and  X;  are  eigenvalues  (42)  of  the  1-D  problem.  Similarly,  for  the  3-D  problem,  the 
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matrix  Ay  associated  with  the  resulting  linear  system  of  ii  -  m3  ODEs  y  « Ayy  is 
Ay  m  where  Iy  is  the  m2Xm2  identity  matrix.  By  »  dlog (/,  JyJd,  and 

Ty  -  dlay  (1,-04).  Hence,  the  eigenvalues  of  Ay  are 

{X,  +X,  +X4  :l"l,-)aj«l^1af»-lr^i  }, 
when  X|,  kJt  and  Xt  an  eigenvalues  (42)  of  the  1-D  problem. 

We  compare  the  computational-work  per  itep  required  by  each  of  the  ftte  stiff-ODE 
solvers  considered  above  to  integrate  the  spedally-diacretixed  1-D,  2-D,  and  3-D  Heat  Prob¬ 
lems.  The  numerical  remits  presented  in  the  next  section  show  that,  for  any  given  problem  in 
this  dess,  each  solver  requires  essentially  the  same  number  of  steps  throughout  the  numerical 
integration.  Thus,  for  each  solver,  the  computational-work  per  step  is  representative  of  the 
total  computational-work  required.  Moreover,  implicit  in  our  comparison  is  the  assumption 
that  each  stiff-ODE  solver  requites  the  same  number  at  Newton  iterations  per  step.  The  vali¬ 
dity  at  this  assumption  is  supported  by  the  numerical  results  also. 

The  computational-work  per  step  can  be  divided  into  three  components:  (1)  the  work  to 
factor  W i  far  the  GE  variants  or  to  compute  a  preconditioning  for  PCR  (if  W*  Is  refactored  or 
the  preconditioning  is  recomputed  on  that  step),  (2)  the  work  to  solve  (2.21)  using  either  the 
LU  factorization  for  the  GE  variants  or  the  (F)CR  method,  and  (3)  all  the  remaining  work 
per  step,  which  is  termed  the  computational-work  overhead.  We  measure  the  computational- 
work  for  each  operation  in  terms  at  the  number  of  arithmetic  operations  required  to  perform 
it. 

Similarly,  the  storage  required  by  each  solver  can  be  divided  into  two  components:  (1) 
the  storage  required  to  solve  the  system  of  linear  algebraic  equations  and  (2)  all  the  remaining 
storage,  which  is  termed  the  storage  overhead. 

Far  each  of  the  five  stiff-ODE  solvers  considered,  both  the  computational-work  sad 
storage  overheads  are  proportional  to  M,  the  dze  of  the  system  of  ODEs  solved.  Moreover,  in 
both  cases,  the  overhead  is  identical  far  each  solver. 
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la  determining  the  computation  al-work  and  storage  required  for  foil  GE,  we  — »»«» 
that  no  advantage  is  made  of  the  sparsity  of  the  matrices  Au  A2,  and  Ay  Thus,  in  <-«rh  case, 
to  factor  the  Newton  iteration  matrix  /  -A,  g,.*,,  i»UA  requires  computational-work  asymp¬ 
totically  proportional  to  ii3,  which  is  m3,  m*,  and  m9,  for  the  1-D,  2-D,  aad  3-D  problems, 
respectively.  la  each  case,  both  the  computation al-work  requited  to  solve  the  associated  sys¬ 
tem  of  linear  equations,  given  the  LU  factorization,  and  the  storage  needed  for  either  the 
Newton  iteration  matrix  or  its  LU  factorization  are  asymptotically  proportional  to  if3,  which 
is  «3,  as4,  aad  «*,  for  the  1-D,  2-D,  aad  3-D  problems,  respectively. 

The  half-band  widths  for  Au  A*  aad  A2  are  1,  m,  aad  si3,  respectively.  Thus,  in  each 
case,  to  factor  the  associated  Newton  iteration  matrix  using  band  OE  takes  enmpMfrfwwj. 
work  asymptotically  proportional  to  m,  m\  aad  m1,  respectively.  Also,  in  each  case,  both  the 
computational-work  to  solve  the  associated  system  of  linear  equations,  given  the  factorization, 
as  well  as  the  storage  required  for  either  the  matrix  or  its  factorization  are  proportional  to  m, 
si3,  and  a*1,  respectively. 

In  determining  the  computational-work  and  storage  required  for  sparse  GE,  we  ■««"» 
that  the  asymptotically  optimal  factorization  is  used,  although,  frequently,  this  is  not  the  case 
in  practice  for  the  2-D  and  3-D  problems  (cf.  [22,  23]).  Thus,  the  computational-work  to  fac¬ 
tor  the  Newton  iteration  matrix  /  -A,  g,A ,  l -1,2,3,  is  asymptotically  proportional  to  m,  m3,  or 
«*.  respectively.  In  each  case,  both  the  computational-work  to  solve  the  associated  system  of 
linear  equations,  given  the  factorization,  u  well  u  the  storage  required  for  either  the  Newton 
iteration  matrix  or  it*  factorization  are  proportional  to  m,  ai*lopi ,  and  si4,  respectively. 

As  stated  in  123,  to  compute  a  sufficiently  accurate  solution  for  the  inexact  chord- 
Newtoo  method,  it  is  generally  necessary  to  reduce  the  initial  residual  associated  with  (233) 
by  a  constant  factor  q  only,  where  q  is  typically  about  .L  From  (42),  the  spectral  condition- 
number  of  the  Newton  iteration  matrix  /-A,g,A«,  f  •  123,  increases  with  A,  from  1  at 
hi  -  0  to  4e"*A“3  as  A,  -  • .  Hence,  from  (323),  the  number  of  CR  iterations  required  to 
reduce  the  initial  residual  by  a  factor  of  q  is  at  most  [tof(2/q)w_1A-,j.  in  addition,  because 
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of  the  sparsity  of  Ait  A j,  and  AJt  the  number  of  arithmetic  operations  required  tor  each  CR 
iteration  is  proportional  to  M,  the  dimension  of  the  matrix.  Thus,  for  each  of  these  matrices, 
the  computational-work  requited  to  compute  a  sufficiently  accurate  solution  to  (233)  is  at 
moat  asymptotically  proportional  to  si3,  as*,  and  at4,  respectively,  and  the  storage  required  is 
asymptotically  proportional  to  m,  m3,  and  as3,  respectively. 

Far  either  the  SSOR  [42, 85]  or  MZLU  [41]  preconditioning  (with  the  appropriate  choice 
of  scalar  parameters),  the  spectral  condition-number  of  the  preconditioned  Newton  iteration 
matrix  increases  with  A,  from  1  at  A.  •  0  to  eh*1  (for  some  constant  c)  as  h,  »  •  [10J. 
Hence,  the  number  of  PCR  iterations  required  to  reduce  the  initial  residual  by  a  factor  of  n  is 
at  most  asymptotically  proportional  to  In  addition,  because  of  the  sparsity  of  each  New¬ 
ton  iteration  matrix  and  its  associated  preconditioning,  the  number  of  arithmetic  operations 
required  far  each  PCR  iteration  is  proportional  to  M,  the  dimension  of  the  system.  Thus,  in 
each  case,  the  campotatiaual-work  required  by  PCR  to  compote  a  sufficiently  accurate  solu¬ 
tion  to  (233)  is  at  moot  proportional  to  as1*,  as**,  and  as3*,  respectively.*  and  the  storage 
required  remains  proportional  to  m,  as3,  and  as3,  respectively.  Moreover,  in  each  caae,  the 
work  required  to  compute  the  MILU  factorization  is  proportional  to  the  number  of  nonzeros 
in  the  matrix,  m.  as3,  and  as3,  respectively,  while  no  work  at  all  is  required  to  'compute'  the 
traditional  form  of  the  SSOR  *factorizstion*. 


The  computational-work  estimates  given  above  are  biased  in  favour  of  the  CE  variants. 
During  the  initial  transient  for  each  problem,  the  steptixe  A,  b  'small*,  and,  consequently,  the 
condition  number  of  the  Newton  iteration  matrix  I-KPmA,,  1  •  123,  or  the  associated 
preconditioned  mntrix  b  'dose*  to  L  As  a  result,  the  computational  work  per  step  for  CR 
and  PCR  b  much  smaller  during  the  initial  transient  than  the  estimates  given  above  indicate: 
these  estimates  are  accurate  for  A,  'large'  only.  On  the  other  hand,  the  computational-work 
required  by  the  OB  variants  to  factor  and  solve  the  Newton  systems  b  independent  of  the 
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scepeize  A,.  In  addition,  even  for  A,  Targe*,  the  computational-work  estimates  for  PCR  do  not 
appear  to  be  optimal,  whereas  the  estimates  for  the  three  GE  variants  discussed  above  are 
optimal.  For  example,  Chan,  Jackson,  and  Zhu  [9]  show  that  there  is  strong  evidence  that,  if 
the  *AD>DKR*  preconditioning  is  used  for  the  2-0  problem,  then  the  condition  number  of  the 
preconditioned  Newton  iteration  matrix  is  asymptotically  proportional  to  and  that  the 
number  of  PCR  iterations  required  to  reduce  the  initial  residual  by  a  constant  factor  of  n  is 
asymptotically  proportional  to  Again,  because  of  the  sparsity  of  A]  and  the  associated 
AD-DKR  preconditioning,  the  number  of  arithmetic  operations  required  for  each  PCR  itera¬ 
tion  is  proportional  to  M,  the  dimension  of  the  system.  Thus,  there  is  strong  evidence  that, 
for  the  2-D  problem,  the  computational-work  required  by  PCR  to  compute  a  sufficiently  accu- 

jA  i 

rate  solution  to  (233)  is  at  mast  asymptotically  proportional  to  si  1 ,  rather  than  at**.  More¬ 
over,  both  the  storage  required  for  PCR  and  the  computational-work  needed  to  compute  the 
AD-DKR  incomplete  factorization  remain  asymptotically  proportional  to  at*. 

The  computational-work  and  storage  required  for  each  of  the  live  stiff- ODE  solvers  is 
summarized  in  Table  43.  These  estimates  show  that,  for  this  class  of  problems,  the  user 
should  take  advantage  of  sparsity:  full  GE  is  not  competitive  with  either  band  or  sparse  GE. 
For  the  1-D  problem,  band  (sparse)  GE  is  the  most  effective  method.  On  the  other  hand,  for 
the  2-D  problem,  both  CR  and  PCR  require  asymptotically  less  storage  than  any  of  the  GE 
variants  and  are  asymptotically  faster  than  either  full  or  band  GE.  However,  it  is  not  dear 
which  of  PCR  or  sparse  GE  is  asymptotically  faster.  The  answer  to  this  question  depends  on 
how  frequently  the  linear  systems  must  be  refactored  as  the  stepsizc  increases  during  the 
courae  at  the  numerical  integration  when  sparse  GE  is  used  as  well  as  the  proportion  of  steps 
taken  in  the  transient  region  where  A,  is  ’small*  and  PCR  requires  less  computational-work 
per  step  than  the  estimates  in  Table  4.1  indicate.  For  the  3-D  problem,  though,  PCR  is  asymp¬ 
totically  tamer  than  any  of  the  other  methods  and  requires  significantly  less  storage  than  any 
of  the  GE  variants. 


34- 


1 _ _ 

band  GE 

^soan^GI^ 

jpni 

PCR 

1 

factor 

- as 5 - 

m 

a 

• 

«J 

solve 

m2 

a 

a 

mm 

■3a 

WEST 31 

ni 

a 

a 

a 

a 

overhead 

a 

a 

n 

a 

a 

2-D 

factor 

m  1 

m* 

«r 

- 

mi 

•ohre 

-.4 

KOI 

mm 

■3a: 

storage 

s.3 

mama i 

mm 

mm 

QVCfhCid 

SI3 

as3 

m2 

s.1 

Ksjy 

3-D 

factor 

— as* — L 

mJ 

*• 

• 

Mm 

solve 

“JJjT 

m* 

SI' 

m* 

Cili 

wssrm 

si* 

n4 

m1 

mm 

^verhea^ 

m3 

mm 

wrm 

Table  4J:  The  principal  asymptotic  term  for  the  storage  for  sad  the 
coaputaional-work  per  step  repaired  to  factor  end  solve  the  linear  algebraic  sys¬ 
tems  that  arise  during  the  numerical  integration  of  the  spatially-discretized  Heat 
Problem,  as  well  ss  the  overhead  of  all  the  remaining  storage  and  camputaticnal- 
work  per  step  required  by  the  stiff -ODE  solver. 


5.  Numerical  Results. 

We  have  replaced  the  direct  linear-equation  solvers  in  LSODE  [48]  by  PCGPACK.  a  col¬ 
lection  of  preconditioned  Krylov  subspace  methods  implemented  by  Elman  [21.  23].  We  refer 
to  the  resulting  experimental  code  as  LSQDCG.  In  this  section,  we  report  some  preliminary 
numerical  experiments  with  LSODCG  to  test  the  effectiveness  of  iterative  linear-equation 
solvers  in  codes  for  large  systems  of  stiff  IV Ps  for  ODEs.  In  particular,  we  compare  the  per¬ 
formance  of  LSODCG  and  LSODES*  [49]  on  two  pairs  of  spatially-discretized  two-  and 
three-dimensional  linear  parabolic  problems  as  well  as  the  performance  of  LSODCG  and 
LSODE  on  the  thirty  Stiff  Detest  Problems  [27,  29].  Although  mast  of  the  Stiff  Detest  Prob¬ 
lems  an  not  large,  they  do  test  the  robustness  of  the  inexact  chord-Newton  method  and  the 

o 

associated  iterative  linear -equation  solvers  used  in  LSODCG.  These  preliminary  test  results 
look  quite  promising. 

4  LSOOtS  is  •  miss  at  LSODE  iscorporadae  tbs  Ysls  Spans  Matrix  hctip  (22,  23)  to  sob*  tbs  systsws  of 
It— or  iqomi  sqsadsas  by  a  spans  Arses  ■atasd. 
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5.1.  LSODE,  LSODES,  aad  LSODCG. 

We  developed  two  variiats  of  LSODCG:  LSODCG.V1  tad  LSODCG.V2.  la  the  former, 
we  did  not  make  say  modifications  to  the  formulas,  strategies,  or  heuristics  used  in  LSODE 
other  than  those  modifications  that  were  necessary  to  interface  LSODE  and  PC G PACK,  such 
as  changing  the  data  structure  for  storing  matrices  ia  LSODCG  to  the  sparse  TA-JA-A' 
representation  used  ia  PCGPACK  aad  many  other  sparse  linear-equation  solvers.  In 
LSODCG.V2,  we  made  one  additional  modification  to  LSODE  in  hope  of  reducing  the 
number  of  inexact  chord-Newton  iterations  aad  associated  function  evaluations  throughout 
the  course  of  the  numerical  integration:  each  time  A.S,  is  changed  in  LSODCG.V2,  this  «•»»»«• 
factor  is  updated  ia  the  Newton  iteration  matrix  /-*,&,/  without  re-evaluating  /,  and,  if  a 
precon ditioccr  is  being  used  in  PCGPACK,  it  is  recomputed.  Since  the  Jacobian  approxima¬ 
tion  J  is  not  re-evaluated,  these  updates  are  relatively  cheap  compared  to  solving  the  associ¬ 
ated  linear  algebraic  equations.5  Ia  LSODE,  LSODES,  and  LSODCG.V1,  on  the  other  hand, 
the  Newton  iteration  matrix  is  updated  only  when  the  magnitude  of  the  relative  ehangg  in 
Kfim  is  greater  than  CCMAX,  a  constant  set  to  3  in  each  of  these  three  codes.  Whenever  the 
Newton  iteration  matrix  is  updated,  either  it  is  refactored  in  LSODES  or,  if  a  preconditioner 
is  being  used  ia  PCGPACK,  the  preconditioner  is  recomputed  in  LSODCG.V1.  In  LSODE 
aad  LSODCG.VI,  the  Jacobian  approximation  J  is  re-evaluated  whenever  the  Newton  itera¬ 
tion  matrix  is  updated;  in  LSODES,  the  Jacobian  is  re-evaluated  only  when  it  is  estimated  to 
be  a  poor  approximation  to  the  current  Jacobian. 

In  all  four  codes,  the  acceptance  criterion  for  the  Newton  iteration  is  of  the  form  (2JJ) 
with  c i -CONIT ■  JfQ^2 •  whCTe  *•  !tle  order  of  the  BDF  in  use.  In  both  variants  of 

LSODCG,  we  use  a  stopping  criterion  for  the  iterative  linear-equation  solver  in  the  inexact 
chord-Newton  method  of  the  form  (2.3 J).  Our  numerical  experiments  show  that  any  r  ia  the 

5  UpSadas  the  Meatus  ittnttoa  mots  weald  be  eves  thwpsr  if  LSOOCO.V2  sored  -■*-—/  -J  rather  thm 
/  -A,  .  ThU  cheats  ia  eeey  to  implewest. 
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ra&fe  [J..5]  is  quite  satisfactory:  unaller  vaioes  of  r  in  this  range  lead  to  more  PCGPACK 
iterations  per  inexact  chord-Newton  iteration,  but  frequently  lead  to  fewer  inexact  ehord- 
Newton  iterations  resulting  fat  fewer  function  evaluations.  Some  numerical  results  along  this 
line  are  reported  in  the  third  subsection.  Also,  as  mentioned  in  I2J,  we  take  -F(yf),  rather 
than  0,  as  an  initial  guess  for  yj^-yj  in  (133)  for  both  variants  of  LSODCG. 


Consider  the  Heat  Equation  in  two  dimensions  (2-D) 


hr  -«»  +*jy 

and  three  dimensions  (3-D) 

(333) 

%■*■+*•>»  +■» 

and  the  Convection-Diffusion  Equation  in  2-D 

(333) 

■»  **■  +*» 

and  3-D 

(333) 

%*hw+«l»+«bT+«S+*W+». 

(33.4) 

each  with  homogeneous  Dirichlet  boundary-conditions  either  on  the  unit  square  [0,l]x[0,l]  for 
the  2-D  problems  or  on  the  unit  cube  [0.1]  x  [0,1]  x  [0,1]  for  the  3-D  problems  and  initial  condi¬ 
tions  for  r  <  (0,1024] 


•CUdO  -  16a(l-xjy(l“7) 

for  the  2-D  problems  and 

a  (0*  j  j)  -  64a  (1-Jt  )y  (1-y  )s  (1-r  ) 

for  the  3-D  problems.  As  described  in  14,  applying  the  method  of  lines  to  the  Heat  Equation 
with  m+1  evenly  spaced  grid  points  in  each  dimension  ordered  in  the  usual  Icft-to- right 
bottom-to-top  manner  and  uaing  the  usual  three-pout  second -order  cent  ered-difference 

q^umrinutien  to  the  -conduct  derivmive.  with  steprixe  A  -  ^  yieldi  s  »y«em 
of  stiff  ODEs  of  the  form 

y'(*)-  A  y(t),  (333) 


a 


where  A  is  a  constant  symmetric  negative-definite  matrix  with  A  ■  A2  of  dimension  14  »  m1 
for  the  2-D  problem  and  A  •  A  j  of  dimension  M  *  m1  for  the  3-D  problem.  Applying  the 
method  of  lines  to  the  Convection-Diffusion  Equation  in  a  similar  way,  bat  with  the  addition 
of  the  two-point  second-order  cent crcd-d iff crence  approximation  to  the  first-order  spatial 
derivatives,  also  yields  a  system  of  stiff  DDEs  of  the  form  (5.25),  where  again  A  is  a  constant 
matrix  of  dimension  Af  -  m2  for  the  2-D  problem  and  of  dimension  il  »  at*  for  the  3-D  prob¬ 
lem.  In  this  case,  though,  A  is  a  nonsymmetrie  negative-real  matrix  for  both  the  2-D  and  3-D 
problems. 

The  eigenvalues  and  eigenvectors  of  the  matrix  A  associated  with  the  spatially- 
discretized  Heat  Equation  (5 25)  are  well-known.  Therefore,  the  exact  solution  of  the  associ¬ 
ated  IVP  can  be  calculated  easily  for  any  t.  For  the  Convection-Diffusion  Problem,  we  used 
EISPACK  [31,  78]  in  doable  precision  on  an  IBM  3033  to  ealenlate  the  eigenvalues  and  eigen¬ 
vectors  of  the  matrix  A  associated  with  the  spatially-discretized  1-D  problem  of  the  form 
(525).  Since  the  solution  of  the  spatially-discretized  2-D  and  3-D  problems  can  be  writted  as 
the  tensor  product  of  solutions  of  the  associated  1-D  problems,  the  exact  solution  of  the 
qmtially-discretized  2-D  and  3-D  Convection-Diffusion  Problems  can  be  computed  easily  for 
any  t  also. 

We  used  LSODES,  LSODCG.V1,  tad  LSODCG.V2  on  an  IBM  3033  computer  in  double 
precision  to  compute  numerical  solutions  of  the  2-D  problems  for  m  »  5,  10,  15, 20, 25, 30  and 
the  3-D  problems  for  m  ■  3,  5,  7,  9.  In  etch  case,  we  used  the  BDFs  with  exact  Jacobians 
(MF«21)  and  an  absolute  local  error  tolerance  of  ATOL  m  IQ-1  (ITOL»l  and  RTOL*0).  We 
integrated  from  the  initial  point  t-0  to  the  output  points  T  -2*  /100,  for  i-0,l,2,_,10,  using  the 
continuation  option  (1ST ATE  *2)  to  integrate  from  one  intermediate  output  point  to  the  next. 
Became  we  did  not  require  the  output  points  to  be  hit  exactly  (ITASKal),  the  solution  vector 
is  oomputed  by  interpolation  and,  on  occasion,  more  than  one  solution  vector  is  computed  per 
integration  step,  as  can  be  seen  In  some  of  the  numerical  results  presented  below.  No  optional 
input  (IOPT-0)  was  used. 


We  need  the  PCGPACK  implementation  of  the  Preconditioned  Conjugate  Residual 
(PCX)  method  [25]  and  the  Preconditioned  Orthomia(k)  (POR(ls))  method  [21, 25]  for  k-1,3,5 
to  aoho  the  linear  algebraic  equations  in  LSODCQ.  For  each  of  these  methods,  we  used  one 
of  the  three  PCGPACK  preconditionings: 

L  NOPKE  *  no  preconditioning, 

2.  TCSSOR  •  the  two-cyclic  implementation  [19]  of  the  SSOR  preconditioning,  or 

3.  TCDKR  •  the  two-cyclic  implementation  [19]  of  the  DKX  [18]  incomplete  factorixation, 
more  generally  referred  to  as  the  Modified  Incomplete  LU  (MILU)  factorization  [41]. 

Far  the  TCSSOR  preconditioning,  we  used  «t»2/[l+sin(*V2)],  where  is  the  spatial 

stepaize.  This  value  of  o*  is  "near  optimal*  [85]  for  the  spatially-discretized  2-D  and  3-D  Heat 
Equations.  Although  this  value  of  o  may  not  be  'near  optimal*  for  the  spatially-discretized 
Convection-Diffusion  Equation,  it  is  appropriate  in  this  case  as  well,  since,  in  practice,  an 
optimal  value  of  ss  for  the  problem  to  be  solved  is  typically  not  known.  For  the  TCDKR 
preconditioning,  we  used  <x«0  for  all  problems,  as  recommended  by  Chandra  [10].  In 
Orthamin(k),  the  preconditioning  was  applied  mi  the  right  as  described  in  132.  In  both  vari¬ 
ants  of  LSODCG,  we  used  a  stopping  criterion  of  the  form  (233)  with  r«3  for  each  iterative 
linear-equation  solver.  However,  we  also  set  the  maximum  PCGPACK  iterations  permitted  to 
solve  any  one  linear  system  to  maz(100,10m). 

533.  Detailed  Numerical  Results  for  One  Problem. 

Detailed  results  for  the  numerical  solution  of  the  spatially-discretized  2-D  Convection- 
Diffusion  Problem  with  m-30  using  L5GDES,  LSODCG.V1,  and  LSODCG.V2,  respectively, 
are  given  in  Tables  52.13,  5232,  and  5233.  The  linear-equation  solver  used  in  LSODCG  is 
POR(l)  preconditioned  by  TCDKR.  These  numerical  results  are  representative  of  the  perfor¬ 
mance  of  these  three  codes  on  the  problems  considered  in  this  subsection. 


In  each  table, 


r 
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•  Tiatht  output  poiat, 

•  ERROR  Is  the  root-mean- square  norm®  of  the  difference  between  the  numerical  and  exact 
solutions  to  the  problem  at  T, 

-  HU  and  NQU,  respectively,  an  the  stepsixe  aad  order  used  by  the  BDF  is  the  last  step 
tahea  to  reach  T,  aad 

-  NST,  NFE,  aad  NIB,  respectively,  an  the  total  number  of  steps,  function  evaluations,  and 
Jacobian  evaluations  used  from  the  initial  poiat  t*0  to  the  current  output  poiat  T. 

Note  also  that  NFE  <-1  is  the  number  of  Newton  iterations  used  from  the  initial  point  t*0  to 
the  current  output  point  T,  dace  all  but  the  first  function  evaluation  is  associated  with  a 
Newton  iteration.  For  LSODES,  NLU,  MLTFAC,  aad  MLTSLV,  respectively,  an  the  total 
number  of 

•  LU  factorizations  used, 

•  multiplies  used  in  the  LU  factorizations,  and 

-  multiplies  used  in  forward  and  backward  substitutions 

by  the  Yale  Sparse  Matrix  Package  to  solve  the  linear  equations  that  arise  in  the  numerical 
integration  from- the  initial  point  t-0  to  the  output  poiat  T.  For  LSODCG,  NFRE  aad 
ITSTOT,  respectively,  an  the  total  number  of 

•  prcconditiontap  computed,  aad 

•  iterations  used  by  the  Uaear-equatioa  solvers 

to  integrate  from  the  initial  point  t«0  to  the  output  point  T.  ITSMAX  is  the  "»™"" 
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Also  shown  in  these  tables  is  the  storage  required  by  each  of  the  three  ODE  solvers.  Ia 
each  case,  STRMAT  is  the  n saber  of  noaneros  ia  the  matrix  A  —pasted  with  the  ODE 
(325).  For  LSODES,  STRFAC  is  the  somber  of  noaxcros  ia  the  LU  factorizatioo  compoted 
by  YSMP.  For  LSODCO,  STRPRE  sad  STRMTH,  respectively,  ere  the  number  of  nonzeros 
required  to  store  the  preconditioning  (M  for  TCSSOR  or  TCDKR  sad  1  for  NOPRE)  sad  the 
additional  storage  seed  ia  the  iterative  method  fl4+2k]M  +  2k  for  POR(k)  aad  4M  far  FCR). 
For  both  L50DES  sad  LSODCO,  STRTOT  is  the  total  number  of  storage  locations  required 
for  the  linear  equation  solvers.7  For  LSODES,  STRTOT  “  2  STRMAT  +  2-STRFAC  +  11- M 
+  2,  while,  for  LSODCO,  STRTOT  -  2- STRMAT  4-  STRMTH  +  STRPRE  +M  +  1. 

The  values  of  ERROR.  HU,  NQU,  aad  NST  are  very  similar  for  all  three  codes.  From 
this  we  deduce  that,  for  this  cl—  of  problems  at  least,  the  error-control,  stepsize -selection , 
aad  order-selection  strategies  ia  LSODE  are  not  significantly  affected  by  the  use  of  aa  itera¬ 
tive  linear-equation  solver.  Although  NFE  also  is  similar  for  all  three  codes,  it  is  7-10% 
smaller  for  LSODCG.V2  than  far  cither  of  the  other  two  codes  indicating  that  the  use  of  the 
current  value  of  ia  the  Newton  iteration  matrix  /-h,p,7  reduces  slightly  the  total 
number  of  Newton  iterations  required  throughout  the  integration. 

The  difference  in  NJE  for  LSODES  aad  LSODCQ.Vl  demonstrates  the  superiority,  for 
this  dm  of  problems  at  least,  of  the  strategy  need  ia  LSODES  over  the  one  used  in 
LSODCO.Vl  (taken  without  modification  from  LSODE)  for  determining  when  a  Jacobian  re- 
evaluation  is  required.  LSODCO. V2  uses  two,  rather  than  one,  Jacobian  evaluations  because 
we  did  not  alter  LSODE't  strategy  that  forces  a  Jacobian  re-evaluation  every  MSBF  (-20) 
steps.  If  this  requirement  were  removed  from  LSQDCG.V2,  then  it  too  would  use  only  one 
Jacobin  evaluation  throughout  the  course  of  the  integration,  as  it  should  for  this  cl—  of 
problems. 

7  Ws  so—  sseb  doable  pwddsc  aad  lengw  variable  m  oes  ear—  leetdos  iltbosfb.  os  tbs  IBM  30^.,  .  t  doe- 
Me  ynddns  sddb  isqaine  M  words  of  Congo  wb—  sseb  letagw  vailiMo  rsqWra  ooiy  oss.  Hosrsvw, 
tbte  asks  dflract  is  As  oospwlios  of  tbo  sotifi  tsqairsd  bj  iiontlvt  aad  dirset  Basst  oq—  sotvsn 
daaa,  to*  •  |t*oa  pintilwr  both  tscbaSqsaa  —  —rcwlwitHy  tbs  mom  propordoa  of  latogsr  to  doabto  pm— 
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The  sequence  of  values  of  NLU  for  LSODES  and  both  NJE  and  NPRE  for  LSODCO.V1 
an  identical  indicatiin  that  both  codes  had  the  same  number  of  'significant*  changes  in  ft,fi» 
between  ends  peir  at  output  paints,  when  by  a  'significant*  change  we  mean  that  the  magni- 
tnde  of  the  relative  change  in  ft,fi,  is  greater  than  CCMAX  (- J).  The  values  of  NPRE  for 
LSODCG.V2  an  slightly  larger  than  those  for  LSODCG.VL  Thus,  assuming  that  the  stcpaize 
sequences  in  all  three  codes  wen  similar,  then  wen  some  changes  of  ft,fi.  in  LSODES  and 
LSODCG.V1  that  wen  not  'significant*.  Consequently,  on  some  steps  in  LSODES  and 
LSODCG.V1,  the  factor  ft.fi,  in  the  Newton  iteration  matrix  /-ft.fi.7  was  not  equal  to  the 
value  at  ft.fi.  used  in  the  BDF  on  that  step.  On  the  other  hand,  LSODCG.V2  updates  the 
factor  ft.fi.  in  the  Newton  iteration  matrix  whenever  ft.fi,  changes.  This  may  explain  why 
LSODCG.V2  used  fewer  Newton  iterations  (NFE-1)  than  either  of  the  other  two  codes.  As  a 
remit,  MLTTOT  is  smaller  for  LSODCG.V2  than  LSODCG.V1  at  each  output  point  even 
though  LSODCG.V2  re-computed  the  TCDKR  precondition er  more  frequently  than 
LSODCG.V1  did:  the  reduction  in  the  number  of  Nearton  iterations  and  associated  linear- 
system  solves  more  than  offset  the  additional  preconditioner  computations. 

The  final  value  of  MLTTOT  is  approximately  the  same  for  all  three  codes.  However, 
during  the  initial  sages  of  the  integration,  MLTTOT  for  the  two  variants  of  LSODCG  is 
■gnificantly  less  than  for  LSODES.  For  these  steps,  ft,  is  small  and  the  spectrum  of  Z-ft.fi,  .7 
is  clustered  around  1.  Consequently,  only  a  few  POR  iterations  (XTSMAX)  are  required  to 
solve  each  linear  system.  However,  as  the  integration  proceeds  and  ft,  grows,  the  spectrum  of 
/—ft, fit/  txptndt  and  more  iterations  are  required  to  solve  each  linear  system.  However,  for 
k,  >  L  1TSMAX  does  not  grow  significantly  with  A,,  since,  as  a  rule  of  thumb,  it  is  the  rela¬ 
tive  sine  of  the  eigenvalues  to  one  another,  rather  than  the  absolute  size  at  the  eigenvalues, 
that  determines  the  rate  of  convergence  of  most  Krylov  subspace  methods,  and  the  relative 
aim  of  the  eigenvalues  does  not  change  significantly  with  ft,  for  ft,  >  1.  For  LSODES, 
MLTFAC  is  approximately  two- thirds  of  MLTTOT,  and  this  factor  grows  as  the  grids  become 
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Althongh,  for  this  problem,  etch  code  requires  approximately  the  toe  amount  of 
computational-work,  STRTOT  for  LSODES  is  about  four  times  the  value  of  STRTOT  for 
either  variant  of  LSODCG.  Moreover,  this  factor  grows  exponentially  as  the  grids  become 
finer.  In  addition,  note  that,  for  LSODES,  STRFAC  is  about  five  times  as  large  as  STRMAT. 
On  the  ocher  hand,  for  LSODCG  with  POR(l)  preconditioned  by  TCDKR  (or  TCSSOR). 
STRFRE  is  about  one  eighteenth  of  STRTOT,  since  the  TCDKR  (or  TCSSOR)  precondi¬ 
tioner  requires  only  one  M-vector  of  storage. 

A  Summary  of  the  Numerical  Remits  for  AO  the  Test  Problems. 

We  present  below  a  summary  of  the  numerical  results  for  LSODES  and  LSODCG. V2 
n««wg  the  PCGPACK  linear-equation  solvers  PCR  and  POR(k),  k»l ,3,5,  preconditioned  by 
NOPRE,  TCSSOR,  and  TCDKR  far  the  four  spatially-discretized  parabolic  problems onexa 
grids,  m  *5,10,15  _J0,  for  the  2-D  problems,  and  axsx«  grids,  m-3,5,7,9,  for  the  3-D  prob¬ 
lems.  Since  the  numerical  results  for  LSODCG. V2  are  similar  to,  but  generally  better  than, 
those  for  LSODCQ.V1.  we  have  not  included  a  summary  of  the  numerical  results  for  the 
latter  code. 

The  total  number  of  PCGPACK  iterations,  FTSTOT,  and  the  maximum  number  of  itera¬ 
tions  far  any  one  solve,  ITSMAX.  used  by  LSODCG.V2  throughout  the  integration  are  listed 
in  Tables  522A  and  5222  and  Tables  3 223  and  522.4,  respectively.  Graphs  of  m  against 
ITSMAX  on  a  log-log  scale  for  POR(l)  preconditioned  by  NOPRE,  TCSSOR,  and  TCDKR 
are  given  in  Plots  52.2.1  and  5222  for  the  two  2-D  problems. 

The  total  number  of  multiplies,  MLTTOT,  used  by  LSODES  and  LSODCG.V2  to  solve 
the  linear  algebraic  systems  throughout  the  integration  are  listed  in  Tables  5223  and  5223. 
Graphs  of  m  against  MLTTOT  on  a  log-log  scale  for  LSODES  and  LSODCG.V2  with  POR(l) 
preconditioned  by  NOPRE  and  TCDKR  are  given  in  Plots  5223  to  5226  for  all  four  prob- 


The  total  storage,  STRTOT,  required  by  the  linear-equation  solvers  in  LSODES  and 


LSODCQ.VZ  for  the  2-0  and  3-D  problems  is  given  in  Table  522.7.  (Each  linear  cq nation 
•elver  requires  the  tame  amount  of  *orage  for  both  of  the  2-D  problems  as  veil  as  the  same 
amount  of  itorage  for  both  of  the  3-D  problems.)  Graphs  of  m  against  STRTOT  on  a  log-log 
acale  for  LSODES  and  LSODCG.V2  with  PQR(1)  preconditioned  by  NOPRE  and  TCDKR  (or 
TCSSOR)  are  given  in  Plots  522.7  and  3 222  for  the  2-D  and  3-D  problems. 

An  entry  of  **"  in  piece  of  a  number  in  these  tables  indicates  that,  during  the  course  of 
the  integration,  the  associated  iterative  linear-equation  solver  failed  to  converge  in  the  max¬ 
imum  number  of  iterations  allowed,  max(100,10m).  Only  PCR  with  no  preconditioning  failed 
to  converge,  and  it  failed  on  the  spatially-discretized  2-D  Convection-Diffusion  Problem  with 
m“10  and  15  only.  It  is  in  fact  surprising  that  PCR  did  not  fail  on  more  of  the  Convection- 
Diffusion  Problems,  since  the  linear  systems  associated  with  these  problems  are  nonsymmetric 
and  PCR  is  not  (in  theory  at  least)  applicable  to  such  systems. 

Consider  the  results  for  LSODCG.V2  first.  For  these  test  problems,  POR(l)  is  the  moat 
effective  of  the  four  baric  FCGPACK  methods  considered.  For  a  given  problem  and  precon¬ 
ditioning,  MLTTOT  for  POR(k),  k-1,3,5,  generally  increases  with  k  even  though  1TST0T 
often  decreases  with  k:  the  reduction  in  the  number  of  iterations  is  more  than  offset  by  the 
additional  work  required  per  iteration  as  k  increases.  As  mentioned  above,  PCR  failed  on 
two  problems  and  is  not  guaranteed  to  converge  for  any  nonsymmetric  linear  system.  Further¬ 
more,  for  the  symmetric  Heat  Problems,  PCR  is  nor  significantly  more  efficient  than  POR(l). 
On  the  contrary,  when  preconditioned,  PCR  frequently  requires  more  multiplies  than  POR(l) 
■nee,  even  though  PCR  may  require  fewer  iterations,  fewer  multiplies  ate  required  per  itera¬ 
tion  to  precondition  P0R(1)  on  the  right  than  to  precondition  PCR  symmetrically,  as  is 
required  for  the  latter  method. 

Of  the  three  preconditioning!,  TCDKR  is  nearly  always  the  most  effective  in  terms  of 
both  multiplies  and  iterations  required.  The  effectiveness  of  preconditioning  is  much  more 
pronounced  for  the  nonsymmetric  Convection-Diffusion  Problems  than  for  the  symmetric 
Heat  Problems.  In  fset,  for  the  latter  data  of  problems,  MLTTOT  for  TCSSOR  is  frequently 
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targer  than  (or  NOPRE  (or  the  same  basic  PCGPACK  method,  since  the  additional  work 
required  to  precondition  is  not  offset  by  a  sufficient  reduction  in  the  number  aC  PCGPACK 
iterations  used  throughout  the  integration.  This  is  not  the  case  tor  TCDKR.  Although 
NOPRE  required  (ewer  multiplies  than  TCDKR  on  some  coarse  grid  problems,  the  difference 
is  never  significant.  On  the  other  hand,  TCDKR  is  frequently  substantially  more  effective 
than  NOPRE  in  terms  of  both  multiplies  and  iterations  required  by  PCGPACK.  Although 
nSMAX  (or  TCDKR  and  TCSSOR  are  frequently  close,  ITSTOT  (or  TCDKR  is  usually 
significantly  leas  than  (dr  TCSSOR,  indicating  that  TCDKR  is  substantially  more  effective 
than  TCSSOR  on  the  large  number  of  linear  algebraic  systems  for  which  A,  is  small  and  the 
spectrum  of is  clustered  around  L 

From  the  graphs  of  ITSMAX  in  Plots  5222  and  5222,  it  can  be  seen  that  not  only  do 
the  preconditioned  POR(l)  methods  require  fewer  PCGPACK  iterations  than  POR(l)  with  no 
preconditioning  but  also  the  difference  grows  exponentially  with  m.  Although  not  shown 
here,  graphs  for  ITSTOT  are  similar,  but,  in  this  case,  TCDKR  can  be  seen  to  be  substantially 
more  effective  than  TCSSOR.  Graphs  of  MLTTOT  for  POR(l)  preconditioned  by  NOPRE 
and  TCDKR  are  shown  in  Plots  5222  to  5222.  Note  that  not  only  is  MLTTOT  significantly 
smaller  for  TCDKR  than  NOPRE,  except  on  the  coarsest  grids,  but  also  the  difference 
between  MLTTOT  for  these  two  preconditioninp  grows  exponentially  with  m. 

Now  compare  LSODES  to  LSODCG.V2  with  FOR(l)  preconditioned  by  TCDKR. 
Tables  5225  and  5222  and  Plots  5.2.23  to  533.6  reveal  that  LSODES  requires  fewer  multi¬ 
plies  than  LSODCG.V2  for  the  2-D  problems,  except  on  the  finest  grid  (m*30).  However,  the 
difference  decreases  with  m  and  an  extrapolation  of  the  graphs  in  Plots  5333  and  5333  sug¬ 
gests  that  LSODCG.V2  with  POR(l)  preconditioned  by  TCDKR  would  become  increasingly 
more  efficient  than  LSODES  for  these  two  2-D  problems  on  finer  grids.  For  the  two  3-D 
problems  with  m«9,  LSODES  requires  more  than  tea  times  as  many  multiplies  as 
LSODCG.V2  to  solve  the  linear  algebraic  equations  throughout  the  integration.  Moreover, 
from  Plots  5 22A  and  5333,  it  la  dear  that  this  factor  grows  exponentially  with  m. 


-52- 


This  support*  tad  extend*  oar  earlier  observation  baaed  on  theoretical  estimates  of  the 
computational- work  in  |4  that  iterative  method*  are  significantly  more  efficient  than  direct 
method*  for  solving  the  tpedafly-discrctiacd  3-D  Heat  Problem. 

Table  $22.7  and  Plot*  522. 7  and  5222  show  that,  for  both  the  2-D  and  3-D  problems, 
STRTOT  is  nguificantly  larger  for  LSODES  than  LSODCG.V2:  for  the  2-D  problems  with 
m*30,  LSODES  require*  approximately  3.7  time*  u  much  storage  u  LSODCG.V2  and,  for  the 
3-D  problem*  with  m»9,  LSODES  requires  approximately  6.4  times  as  much  storage  as 
LSODCG  .V2.  Moreover,  for  both  the  2-D  and  3-D  problems,  this  factor  grows  exponentially 
with  m. 

13.  sag  Pstsst  Problem*. 

We  need  LSODE,  LSODES,  LSODCG.VL  and  LSODCG.V2  an  an  IBM  3033  computer 
in  doable  precision  to  solve  the  30  Stiff  Detest  Problems  [27,  29].  Although  these  problems 
are  not  large,  they  do  test  the  robustness  of  the  inexact  chord-Newton  method  and  the  associ¬ 
ated  iterative  linear-equation  solvers  in  the  two  variants  of  LSODCG. 

For  each  of  the  tour  codes,  we  solved  the  Stiff  Detest  Problems  using  the  BDFs  with 
exact  Jacobiaas  (MF-21)  to  an  absolute  local  error  tolerance  of  ATOL  m  IQ-2, 10“*,  10-*,  and 
10"*  CRTOL-O  and  ITOL-1). 

For  LSODCG. VI  and  LSODCG.V2,  we  used  POR<5),  the  PCGPACK  [21.  26]  implemen¬ 
tation  of  Orrhomia(5)  (20,  24],  to  solve  the  linear  algebraic  systems  of  equations  that  arise  in 
the  inexact  chord-Newtoa  method.  We  did  not  precondition  POR(5)  because,  for  many  of  the 
Stiff  Deram  Problems,  an  incomplete  factorization  would  actually  yield  the  exact  factorization 
at  the  associated  Newton  iteration  matrix  and,  consequently,  POR(5)  would  generate  the 
com  solution  to  the  linear  algebraic  equations  in  one  iteration. 

We  used  a  stopping  criterion  at  the  form  (233)  with  r  ■  J,  25,  and  3  tat  the  solution 
of  the  linear  algebraic  systems  arising  in  the  inexact  chord-Newton  method.  Since  the  Stiff 
Detest  Problems  are  small  and  the  tolerance  for  the  linear  algebraic  systems  is  lax,  we  allowed 


•  maximum  of  90  POR(5)  iterations  to  solve  each  linear  algebraic  system. 

We  present  oar  results  far  LSODE  and  LSODCG.V2  only.  As  in  the  previous  section, 
the  results  for  LSODCG.V2  are  generally  better  than  those  for  LSODCG.Vl,  and  the  stra¬ 
tegies  used  in  LSODCG.V2  are  closer  to  those  in  LSODE  than  those  in  LSODES. 

In  Tables  5.3.1  and  532,  respectively,  we  present  the  'normalised*  number  of  function 
evaluations  and  Jacobian  evaluations  required  by  LSODE  and  LSODCG.V2  with  r  »  J,  23, 
and  3  to  solve  each  of  the  30  Stiff  Detest  Problems  to  an  absolute  global  error  tolerance  of 
Tot  «  10**, 10“*,  sad  10“*  at  the  end-point  at  the  integration;  in  Table  533,  we  present  the 
'normalised*  total  number  of  POR(S)  iterations  required  by  LSODCG.V2  throughout  the 
integration.  These  normalised  statistics  were  calculated  by  a  new  version  of  the  Stiff  Detest 
Program  which,  as  described  in  [26],  first  performs  a  least  squares  fit  to 

2  poftftoboi  <rror()-log(C)-£‘log(A70£()p 
i-i 

for  C  and  E.  where,  in  this  case,  ATOL,  -  10“*. 10“*,  10"*,  and  10"*  and  NTOL  -  4.  The  Stiff 
Detest  *"r*«  then  performs  a  piecewise  linear  interpolation  on  the  actual  recorded  values 
of  the  coats  to  solve  the  IVP  at  AIOLt  versus  the  corresponding  expected  global  accuracy 
Tai  ■  C  ATOLg  to  strive  at  the  normalised  coats  for  an  absolute  global  error  tolerance  of 
ToL  (A  consequence  of  this  procedure  is  that  the  normalised  function  and  Jacobian  evalua¬ 
tions  an  negative  for  one  problem.) 

A  ***,  or  ¥  may  occur  as  an  entry  in  place  of  a  number  in  these  tables.  A  '•'  indi¬ 
cates  that  Stiff  Detest  could  not  calculate  the  normalised  statistics  for  this  problem  and  toler¬ 
ance  baaed  upon  the  actual  global  errors  incurred.  A  ***  indicates  that  the  method  being 
tested  (LSODE  or  LSODCG.V2)  could  not  solve  the  problem  at  that  tolerance,  and  a  'x*  indi¬ 
cates  that  Stiff  Detest  could  not  solve  the  problem  at  that  tolerance.  In  addition,  the  12  prob¬ 
lems  marked  with  a  '#*  have  a  Jacobian  that  is  not  negative-real  over  some  subinterval  of  the 
range  of  intepstiou. 

From  the  tables,  we  see  that  the  number  of  function  and  Jacobian  evaluations  typically 
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increase*  with  r.  For  the  Jacobian  evaluations,  the  increase  is  generally  not  ngnigeant,  hot, 
(or  the.  function  evaluations,  the  increase  is  frequently  10%  or  more  from  one  value  of  r  to 
the  nest.  On  the  other  hand,  the  number  of  POR(5)  iterations  typically  decreases  with  r  by  a 
factor  of  10%  or  more  Cram  one  value  of  r  to  the  nest.  Hence,  if  a  POR(5)  iteration  is  less 
expensive  than  a  function  evaluation,  then,  based  upon  these  results,  r*JL  would  usually  be 
the  moat  cost  effective  of  the  three  values  considered.  On  the  other  hind,  if  a  FOR(5)  itera¬ 
tion  is  substantially  more  expensive  than  a  function  evaluation  (as  is  the  esse  for  the  problems 
in  the  previous  subsection),  then,  based  upon  these  results,  r-J  would  usually  be  the  most 
east  effective  of  the  three  values  considered.  Thus,  the  choice  of  r  is  dependent  upon  the 
daaa  of  IVPs  solved. 

Except  for  problem  C3,  which  has  a  Jacobian  that  2a  not  negative-real,  LSODCG.V2, 
with  each  of  the  values  of  r  considered,  used  fewer  Jacobian  evaluations  than  LSODE  an  all 


problems  that  were  solved  successfully  by  both  codes.  Moreover,  for  those  IVPs  hiving  s 
negative-real  Jscobisn,  the  nnmber  of  Jseobitn  evelnstioos  required  differs  by  s  factor  of  2  to 
5.  This  seperiority  of  LSODCG.V2  over  LSODE  is  s  result  of  the  strategy  need  in 
LSODCG.V2  described  above  that  permits  it  to  update  the  scalar  factor  in  the  Newton 
iteration  matrix  /  whenever  A,  p,  changes  without  re-evaluating  the  Jacobian,  J.  If  we 

had  also  removed  from  LSODCG.V2  the  requirement  inherited  from  LSODE  that  the  Jaco¬ 
bian  be  re-evaluated  at  least  once  every  MSBP  (*20)  steps,  then  LSODCG.V2  would  have 
used  even  fewer  Jacobian  evaluations. 

Now  consder  the  function  evaluations  required  by  LSODE  and  LSODCG.V2  with  r»J 
to  solve  the  Stiff  Detest  Problems. 

LSODCG.V2  failed  to  solve  4  of  the  Stiff  Detest  Problems  (A2,  D3,  E4,  F5)  at 
Tol  m  io~*.  Each  of  these  problems  has  a  Jacobian  that  is  not  negative-real  over  some  subin¬ 
terval  of  the  range  of  integration.  However,  except  for  problem  F3  at  Tol  « 10~*, 
LSODCG.V2  required  fewer  function  evaluations  than  LSODE  for  these  problems  at 
Tol  -  10"*  sad  Ur*. 

Of  the  remaining  problems,  LSODCG.V2  with  r*J.  used  substantially  fewer  function 
evaluations  than  LSODE  far  7  of  the  Stiff  Detest  Problems  (Al,  A4,  Cl,  C3,  Dl,  D2,  E3). 
Again,  this  may  be  due  to  LSODCG.VTs  updating  the  scalar  factor  Aa&in  the  Newton  itera¬ 
tion  matrix  whenever  A.  {^changes  resulting  in  a  more  accurate  Newton  iteration  matrix  and  a 
more  tepid  convergence  at  the  Newton  iteration. 

LSODE  and  LSODCG.V2  used  approximately  the  same  number  of  function  evaluations 
on  11  of  the  Stiff  Deteat  Problems  (A3,  B2,  B3,  B4,  B3,  C2,  D 5,  D6,  F2,  F3,  F5).  It  is  worth 
noting  that  the  daas  B  problems  ate  of  the  form  y'«  Ay,  where  A  is  a  constant  matrix  with 
eigenvalues  and,  consequently,  A  is  far  from  being  symmetric. 

LSODE  used  substantially  fewer  function  evaluations  than  LSODCG.V2  on  7  problems 
(BL  C4,  C3,  D4,  El,  FI,  F2)  each  of  which  has  s  Jacobian  that  is  not  negative-real  over  some 
snbinterval  of  the  tangs  at  integration. 


Therefore,  except  for  thorn  problems  hiving  a  Jacobin  that  is  not  negative- real,  the  toe 
of  an  Iterative  li&ear-cqnaiion  solver  did  not  cause  the  performance  of  LSODCG.V2  to 
deteriorate  relative  to  the  unmodified  code  LSODE  which  incorporates  a  direct  linear- 
eqaatioe  solver,  h  fact,  LSOOCG.V2  performs  as  well  aa  or  better  than  LSODE  oa  ail  the 
Stiff  Data*  Problems  for  which  LSODCG.V2  is  applicable. 

One  final  point  is  worth  noting.  From  Tables  5.3.1  and  5. 33,  we  see  that,  for  many  of 
the  Stiff  Detest  Problems,  particularly  at  the  more  stringent  tolerances,  aa  average  of  tern  than 
one  PCGPACK  iteration  is  required  per  inexact  chord-Newton  iteration.  That  is,  for  many  of 
(he  inexact  chord-Ncwton  iterations,  the  initial  guess  - F(yf)  for  satisfies  (233)  and 

no  further  PCGPACK  iterations  are  required.  Hence,  when  usng  an  iterative  linear-equation 
solver  in  a  stiff-ODE  code  in  this  way,  we  automatically  obtain  the  benefit  of  the  use  of  an 
inexpensive  predictor-corrector  iteration  when  a  mote  expensive  Newton  iteration  is  not 
inquired.  Moreover,  this  appears  to  have  no  deleterious  effect  upon  the  overall  performance 
of  the  stiff-ODE  salver. 

i.  Ccnrleslma 

Both  the  theoretical  and  numerical  results  presented  in  the  preceding  two  sections  show 
that  the  on  of  iterative  Unear-equation  solvers  in  stiff -ODE  codes  has  the  potential  to 
Improve  the  efficiency  •  in  terms  of  both  computational-work  and  storage  -  with  which  a 
significant  dam  of  stiff  IVFh  having  large  sperm  Jacobian*  can  be  solved.  Moreover,  these 
remits  demonstrate  the  importance  of  preconditioning  for  Krylov  subspace  methods  used  in 
stiff-ODE  solvers. 

The  numerical  results  for  both  the  linear  and  nonlinear  IVft  show  that  the  (topping  cri¬ 
terion  (233)  for  the  inexact  chord-Newton  iteration  works  well  in  practice  for  r  €  [1,3],  This 

e 

supports  the  claim  that  the  Unear  equations  that  arias  in  stiff-ODE  solvers  seed  not  be  solved 
vary  accuratsiy.  Moreover,  the  initial  guesa  -F(yJ)  for  the  solotioo  y i*l-yi  at  the  Uncar  sys¬ 
tem  proved  to  be  quite  effective  in  practice,  particularly  during  the  initial  transient  where  the 


1VP  h  et  OKMt  oddly  sdff . 
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Updsting  the  scalar  factor  k,  (S,  in  the  Newton  Iteration  matrix / whenever  *,p. 
changes  without  re-evmluating  J ,  the  approximation  to  the  Jacobian,  reduces  the  number  of 
Newton  iteration!  and  amociated  function  evaluations  required  throughout  the  course  of  the 
numerical  integration  with  little  added  cost  in  a  stiff-ODE  code  incorporating  an  iterative 
linear-equation  solver.  Furthermore,  this  strategy  of  updating  A,  fi,  whenever  it  changes  facil¬ 
itates  the  decision  when  to  re-evaluate  the  Jacobian  and,  thus,  helps  to  avoid  wasted 
computational-work.  Mote  generally,  as  mentioned  in  823,  the  removal  of  the  constraint 
imp  ns  ml  by  the  necessity  to  avoid  refactoring  in  a  stiff-ODE  code  employing  a  direct 

linear-equation  solver  may  lead  to  other  benefits  in  the  choice  of  formulas,  strategies,  and 
heuristics  for  a  stiff-ODE  code  incorporating  an  iterative  Unear-equation  soiver. 

Moat  importantly,  the  numerical  results  demonstrate  that  stiff-ODE  codes  incorporating 
iterative  linear-equation  solvers  do  not  suffer  n  lorn  of  robustness  on  those  IVFs  for  which  the 
Newton  iteration  nutria  »,*  is  positive-real  throughout  the  course  of  the  numerical  integra¬ 
tion.  Note,  though,  that  this  restriction  on  w!  is  imposed  by  the  iterative  technique  we  chose 
to  solve  the  Unear  systems:  the  restriction  is  not  characteristic  of  til  stiff-ODE  codes  incor¬ 
porating  iterative  Unear-eqnatian  solvers.  In  particular,  as  mentioned  earlier,  there  exist  itera¬ 
tive  linear-equation  aotvera  that  are  guaranteed  to  converge  to  the  solution  of  the  Newton  sys¬ 
tem  (ZZ1)  it  all  the  eigenvalues  of  Wf  lie  in  the  right-half  complex  plane.  As  we  argued  in 
(2.1,  if  wi  does  not  satisfy  this  last  restriction,  then  the  stepsize  is  almost  surely  too  large  and 
should  bo  reduced  until  this  last  restriction  is  satisfied  to  ensure  ■  reliable  numerical  integra- 


Heacs,  it  appears  possible  to  develop  a  stiff-ODE  code  incorporating  an  iterative  linear- 
equation  solver  that,  for  a  broad  clam  of  IVPs,  is  as  robust  as  a  similar  stiff-ODE  code  incor¬ 
porating  n  direct  linear-equation  solver,  but  more  efficient  than  the  latter  code  for  a 
significant  subclass  of  problems  having  large  sparse  Jacobians.  We  plan  to  continue  to  pursue 
this  investigate  in  the  future. 
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